From: Mohammad Akhlaghi
Subject: [gnuastro-commits] master 59620759: Book: Zeropoint tutorial brought into the Tutorials chapter
Date: Thu, 20 Jul 2023 09:01:29 -0400 (EDT)
branch: master
commit 5962075961cd2c94baf726f18999819dcbb5926f
Author: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Commit: Mohammad Akhlaghi <mohammad@akhlaghi.org>
Book: Zeropoint tutorial brought into the Tutorials chapter
Until now, the tutorial describing how to obtain the zero point of an image
was located within the documentation of 'astscript-zeropoint'. This made it
hard to locate, and kept it separate from the other tutorials.
With this commit, it has been brought up to the "Tutorials" chapter with
all the rest of the tutorials.
---
NEWS | 2 +
doc/gnuastro.texi | 35897 ++++++++++++++++++++++++++--------------------------
2 files changed, 17962 insertions(+), 17937 deletions(-)
diff --git a/NEWS b/NEWS
index dfff97c6..067467b5 100644
--- a/NEWS
+++ b/NEWS
@@ -20,6 +20,8 @@ See the end of the file for license conditions.
pattern using the newly added installed script in Gnuastro for
simulating the exposure map of a dither pattern stack (it is called as
'astscript-dither-simulate' on the command-line).
+ - Smaller tutorials that were distributed within the documentation of
+ different programs are brought into the "Tutorials" chapter.
- New "Standard deviation vs. error" sub-section added under the
MakeCatalog section. It uses real examples to clearly show the
fundamental difference between the two (which are sometimes confused
diff --git a/doc/gnuastro.texi b/doc/gnuastro.texi
index 1b842311..5150aa0d 100644
--- a/doc/gnuastro.texi
+++ b/doc/gnuastro.texi
@@ -274,6 +274,7 @@ Tutorials
* Sufi simulates a detection:: Simulating a detection.
* Detecting lines and extracting spectra in 3D data:: Extracting spectra and
emission line properties.
* Color channels in same pixel grid:: Aligning images to same grid to build
color image.
+* Zero point of an image:: Estimate the zero point of an image.
* Dither pattern design:: Simulate the output image of a given dither
pattern.
General program usage tutorial
@@ -331,6 +332,11 @@ Detecting lines and extracting spectra in 3D data
* Extracting a single spectrum and plotting it:: Extracting a single vector
row.
* Pseudo narrow-band images:: Collapsing the third dimension into a 2D image.
+Zero point of an image
+
+* Zero point tutorial with reference image:: Using a reference image.
+* Zero point tutorial with reference catalog:: Using a reference catalog.
+
Installation
* Dependencies:: Necessary packages for Gnuastro.
@@ -770,8 +776,6 @@ Viewing FITS file contents with DS9 or TOPCAT
Zero point estimation
-* Zero point tutorial with reference image:: Using SDSS images to find J-PLUS
zero point
-* Zero point tutorial with reference catalog:: Using SDSS catalog to find
J-PLUS zero point
* Invoking astscript-zeropoint:: How to call the script
Invoking astscript-zeropoint
@@ -2013,6 +2017,12 @@ Showing how Abd al-rahman Sufi (903 -- 986 A.D., the
first recorded description
Because all conditions are under control in a simulated/mock
environment/dataset, mock datasets can be a valuable tool to inspect the
limitations of your data analysis and processing.
But they need to be as realistic as possible, so this tutorial is dedicated to
this important step of an analysis (simulations).
+There are also other tutorials on topics that are commonly necessary in
astronomical research:
+In @ref{Detecting lines and extracting spectra in 3D data}, we use MUSE cubes
(an IFU dataset) to show how you can subtract the continuum, detect
emission-line features, extract spectra and build pseudo narrow-band images.
+In @ref{Color channels in same pixel grid} we demonstrate how you can warp
multiple images into a single pixel grid (often necessary with multi-wavelength
data), and build a single color image.
+In @ref{Zero point of an image} we review the process of estimating the zero
point of an image using a reference image or catalog.
+Finally, in @ref{Dither pattern design} we show the process by which you can
simulate a dither pattern to find the best observing strategy for your next
exciting scientific project.
+
In these tutorials, we have intentionally avoided too many cross references to
make them easier to read.
For more information about a particular program, you can visit the section
with the same name as the program in this book.
Each program section in the subsequent chapters starts by explaining the
general concepts behind what it does, for example, see @ref{Convolve}.
@@ -2026,6 +2036,7 @@ For an explanation of the conventions we use in the
example codes through the bo
* Sufi simulates a detection:: Simulating a detection.
* Detecting lines and extracting spectra in 3D data:: Extracting spectra and
emission line properties.
* Color channels in same pixel grid:: Aligning images to same grid to build
color image.
+* Zero point of an image:: Estimate the zero point of an image.
* Dither pattern design:: Simulate the output image of a given dither
pattern.
@end menu
@@ -8680,7 +8691,7 @@ Click on the scroll-down menu in front of ``Table'' and
select @file{2: collapse
Afterwards, you will see the optimized pseudo-narrow-band image radial profile
as blue points.
@end enumerate
-@node Color channels in same pixel grid, Dither pattern design, Detecting
lines and extracting spectra in 3D data, Tutorials
+@node Color channels in same pixel grid, Zero point of an image, Detecting
lines and extracting spectra in 3D data, Tutorials
@section Color channels in same pixel grid
In order to use different images as color channels, it is important that the
images be properly aligned and on the same pixel grid.
@@ -8747,17290 +8758,17335 @@ This shows how green and red channels have been
slightly shifted to put your ast
If you don't want to have those, or if you want the outer parts of the final
image (where there was no data) to be white, some more complex commands are
necessary.
We'll leave those as an exercise for you to try yourself, using @ref{Warp}
and/or @ref{Crop} to pre-process the inputs before converting them to a color
image.
-@node Dither pattern design, , Color channels in same pixel grid, Tutorials
-@section Dither pattern design
-
-@cindex Dithering
-Deciding a suitable dithering pattern is one of the most important steps when
planning your observation strategy.
-Dithering is the process of moving each exposure compared to the previous
one; it is done for several reasons, like increasing resolution or expanding
the area of the observation.
-For a more complete introduction to dithering, see @ref{Dithering pattern
simulation}.
-Gnuastro has a script (@command{astscript-dither-simulate}) to simplify the
process of choosing the best dither pattern, optimizing your observation
strategy for your scientific goal.
-
-In this tutorial, let's assume you want to observe
@url{https://en.wikipedia.org/wiki/Messier_94, M94} in the H-alpha and rSDSS
filters@footnote{For the full list of available filters, see the
@url{https://oaj.cefca.es/telescopes/t80cam, T80Cam description}.} (to study
the extended star formation in the outer rings of this beautiful galaxy!).
-Including the outer parts of the rings, the galaxy is half a degree in
diameter!
-This is very large, and you want to design a dither pattern that will cover
this area with the maximum depth!
-Therefore, you need an instrument with a large field of view.
-Let's assume that after some searching, you decide to write a proposal for the
@url{https://oaj.cefca.es/telescopes/jast80, JAST80 telescope} at the
@url{https://oaj.cefca.es, Observatorio Astrofísico de Javalambre},
OAJ@footnote{For full disclosure, Gnuastro is being developed at CEFCA (Centro
de Estudios de F@'isica del Cosmos de Arag@'on); which also hosts OAJ.}, in
Teruel (Spain).
-The field of view of this telescope's camera is almost 1.5 degrees wide,
nicely fitting M94!
-
-Before we start, as described in @ref{Dithering pattern simulation}, it is
important to remember that the ideal dither pattern depends primarily on your
scientific objective, as well as the limitations of the instrument you are
observing with.
-Therefore, there is no single dither pattern for all purposes.
-However, the tools, methods, criteria or logic to check if your dither pattern
satisfies your scientific requirement are similar.
-Therefore, you can use the same methods, tools or logic here to simulate or
verify that your dither pattern will produce the products you expect after the
observation.
-The hypothetical scenario above is just an example to show the usage.
-As with any tutorial, do not blindly follow the same final solution for any
scenario; this is just an example to show you @emph{how to} find your own
solution, not to give you the universal solution to any scenario!
-
-To start simulating a dither pattern for a certain telescope, you just need a
single-exposure image of that telescope with WCS information.
-In other words, after astrometry, but before warping into any other pixel grid
(to combine into a deeper stack).
-The image will give us the number of the camera's pixels, its pixel scale
(width of a pixel in arcseconds) and the camera distortion.
-These are reference parameters that are independent of the position of the
image on the sky.
-
-Because the actual position of the reference image is irrelevant, let's assume
that in a previous project, presumably on
@url{https://en.wikipedia.org/wiki/NGC_4395, NGC 4395}, you already have the
download command of the following single-exposure image:
+@node Zero point of an image, Dither pattern design, Color channels in same
pixel grid, Tutorials
+@section Zero point of an image
-@example
-$ mkdir dither-tutorial
-$ cd dither-tutorial
-$ mkdir input
-$ siapurl=https://archive.cefca.es/catalogues/vo/siap
-$ wget $siapurl/jplus-dr3/reduced/get_fits?id=1050345 \
- -O input/jplus-1050345.fits.fz
+The ``zero point'' of an image is astronomical jargon for the calibration
factor of its pixel values; it allows us to convert the raw pixel values to
physical units.
+It is therefore a critical step during data reduction.
+For more on the definition and importance of the zero point magnitude, see
@ref{Brightness flux magnitude} and @ref{Zero point estimation}.
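As a reminder of the definition (see @ref{Brightness flux magnitude}): with a
zero point @mymath{Z}, an object with @mymath{C} counts has a magnitude of
@mymath{-2.5\log_{10}(C)+Z}.
For example, assuming a zero point of 26.4 (close to the value we will find
for this image later in this tutorial), you can convert a count value to a
magnitude with a quick sketch like this (AWK's @code{log} is the natural
logarithm, hence the division):

@example
$ echo 1000 | awk '@{zp=26.4; print -2.5*log($1)/log(10)+zp@}'
18.9
@end example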
-$ astscript-fits-view input/jplus-1050345.fits.fz
-@end example
+@cindex SDSS
+In this tutorial, we will use Gnuastro's @command{astscript-zeropoint}, to
estimate the zero point of a single exposure image from the
@url{https://www.j-plus.es, J-PLUS survey}, while using an
@url{http://www.sdss.org, SDSS} image as reference (recall that all SDSS images
have been calibrated to have a fixed zero point of 22.5).
+In this case, both images that we are using were taken with the SDSS @emph{r}
filter.
@cartouche
+@cindex Johnson filters
+@cindex Johnson vs. SDSS filters
+@cindex SDSS vs. Johnson filters
+@cindex Filter transmission curve
+@cindex Transmission curve of filters
+@cindex SVO database (filter transmission curve)
@noindent
-@strong{This is the first time I am using an instrument:} In case you haven't
already used images from your desired instrument (to use as reference), you can
find such images from their public archives; or by contacting them.
-A single-exposure image is rarely of any scientific value (post-processing
and stacking is necessary to make high-level and science-ready products).
-Therefore, such images become publicly available very soon after the
observation date; furthermore, calibration images are usually public
immediately.
+@strong{Same filters and SVO filter database:} It is very important that both
your images are taken with the same filter.
+When looking at filter names, don't forget that different filter systems
sometimes have the same names for one filter, such as the name ``R''; which is
used in both the Johnson and SDSS filter systems.
+Hence, if you encounter an image in an ``R'' or ``r'' filter, double check to
see exactly which filter system it corresponds to.
+If you know which observatory your data came from, you can use the
@url{http://svo2.cab.inta-csic.es/theory/fps, SVO database} to confirm the
similarity of the transmission curves of the filters of your input and
reference images.
+SVO contains the filter data for many of the observatories world-wide.
@end cartouche
-As you see from the image above, the T80Cam images are large (9216 by 9232
pixels).
-Therefore, to speed up the dither testing, let's down-sample the image above
by a factor of 10.
-This step is optional and you can safely use the full resolution, which will
give you a more precise stack, but which will be much slower (maybe good after
you have an approximate solution on the down-sampled image).
-We will call the output @file{ref.fits} (since it is the ``reference'' for our
test).
-We are putting these two ``input'' files (to the script) in a dedicated
directory to keep the running directory clean (so you can easily delete
temporary/test files for a fresh start with a `@command{rm *.fits}').
+@menu
+* Zero point tutorial with reference image:: Using a reference image.
+* Zero point tutorial with reference catalog:: Using a reference catalog.
+@end menu
-@example
-$ astwarp input/jplus-1050345.fits.fz --scale=1/10 -oinput/ref.fits
-@end example
+@node Zero point tutorial with reference image, Zero point tutorial with
reference catalog, Zero point of an image, Zero point of an image
+@subsection Zero point tutorial with reference image
-For a first trial, let's create a cross-shaped dither pattern around M94
(which is centered at the RA and Dec of 192.721250, 41.120556).
-We'll center one exposure on the center of the galaxy, and include 4 more
exposures that are each 1 arc-minute away along the RA and Dec axes.
-To simplify the actual command later@footnote{Instead of this, you could pass
the @option{--racol=1} and @option{--deccol=2} options when you later call
@command{astscript-dither-simulate}.
-But having metadata is always preferred (it will avoid many bugs/frustrations
in the long run!).}, let's also include the column names through two lines of
metadata.
+First, let's create a directory named @file{tutorial-zeropoint} to keep things
clean, and work inside it.
+Then, with the commands below, you can download an image from J-PLUS and SDSS.
+To speed up the analysis, the image is cropped to have a smaller region around
its center.
@example
-$ step_arcmin=1
-$ center_ra=192.721250
-$ center_dec=41.120556
-
-$ echo "# Column 1: RA [deg, f64] Right Ascension" > dither.txt
-$ echo "# Column 2: Dec [deg, f64] Declination" >> dither.txt
-
-$ echo $center_ra $center_dec \
- | awk '@{s='$step_arcmin'/60; fmt="%-10.6f %-10.6f\n"; \
- printf fmt, $1, $2; \
- printf fmt, $1+s, $2; \
- printf fmt, $1, $2+s; \
- printf fmt, $1-s, $2; \
- printf fmt, $1, $2-s@}' \
- >> dither.txt
-
-$ cat dither.txt
-# Column 1: RA [deg, f64] Right Ascension
-# Column 2: Dec [deg, f64] Declination
-192.721250 41.120556
-192.804583 41.120556
-192.721250 41.203889
-192.637917 41.120556
-192.721250 41.037223
+$ mkdir tutorial-zeropoint
+$ cd tutorial-zeropoint
+$ jplusdr2=http://archive.cefca.es/catalogues/vo/siap/jplus-dr2/reduced
+$ wget $jplusdr2/get_fits?id=771463 -O jplus.fits.fz
+$ astcrop jplus.fits.fz --center=107.7263,40.1754 \
+ --width=0.6 --output=jplus-crop.fits
@end example
-We are now ready to generate the exposure map of the dither pattern above,
using the reference image that we made earlier.
-Let's put the center of our final stack on the center of the galaxy, and we'll
assume the stack has a size of 2 degrees.
-With the second command, you can see the exposure map of the final stack.
-Recall that in this image, each pixel shows the number of input images that
went into it.
+Although we cropped the J-PLUS image, it is still very large in comparison
with the SDSS image (the J-PLUS field of view is almost @mymath{1.5\times1.5}
deg@mymath{^2}, while the field of view of SDSS in each filter is almost
@mymath{0.3\times0.5} deg@mymath{^2}).
+Therefore, let's download two SDSS images (and then decompress them) in the
region of the cropped J-PLUS image to have a more accurate result compared to a
single SDSS footprint: generally, your zero point estimation will have less
scatter with more overlap between your reference image(s) and your input image.
@example
-$ astscript-dither-simulate dither.txt --output=stack.fits \
- --img=input/ref.fits --center=$center_ra,$center_dec \
- --width=2
-
-$ astscript-fits-view stack.fits
+$ sdssbase=https://dr12.sdss.org/sas/dr12/boss/photoObj/frames
+$ wget $sdssbase/301/6509/5/frame-r-006509-5-0115.fits.bz2 \
+ -O sdss1.fits.bz2
+$ wget $sdssbase/301/6573/5/frame-r-006573-5-0174.fits.bz2 \
+ -O sdss2.fits.bz2
+$ bunzip2 sdss1.fits.bz2
+$ bunzip2 sdss2.fits.bz2
@end example
-Because the step size is so small (compared to the field of view), we see that
except for a thin boundary, we almost have 5 exposures over the full field of
view.
-Let's see what the width of the deepest part of the image is.
-First, we'll use Arithmetic to set all pixels that contain less than 5
exposures to NaN (the outer pixels).
-In the same Arithmetic command, we'll trim all the blank rows and columns to
only contain the non-blank region.
-Afterwards, we'll view the deep region with the second command.
-Finally, with the third command below, we'll use the @option{--skycoverage}
option of the Fits program to see the coverage of deep part on the sky.
+To get a feeling for the data, let's open the three images with
@command{astscript-fits-view} using the command below.
+Wait a few seconds to see the three images ``blinking'' one after another.
+The largest one is the J-PLUS crop and the two smaller ones that partially
cover it in different regions are from SDSS.
@example
-$ deep_thresh=5
-$ astarithmetic stack.fits set-s s s $deep_thresh lt nan where trim \
- --output=deep.fits
-
-$ astscript-fits-view deep.fits
-
-$ astfits deep.fits --skycoverage
-...
-Sky coverage by center and (full) width:
- Center: 192.72125 41.120556
- Width: 1.880835157 1.392461166
-...
+$ astscript-fits-view sdss1.fits sdss2.fits jplus-crop.fits \
+ --ds9extra="-lock frame wcs -single -zoom to fit -blink yes"
@end example
-@cindex Sky value
-@cindex Flat field
-As we see, the width of this deep field is about 1.4 degrees (in Dec; the
coverage in RA depends on the Dec).
-This nicely covers the outer parts of M94.
-However, there is a problem: with a step size of 1 arc-minute, the brighter
central parts of this large galaxy will always be on very similar pixels;
making it hard to calibrate those pixels properly.
-If you are interested in the low surface brightness parts of this galaxy, it
is even worse: the outer parts of the galaxy will always cover similar parts of
the detector in all the exposures.
-To be able to accurately calibrate the image (in particular to estimate the
flat field pattern and subtract the sky), you do not want this to happen!
-You want each exposure to cover very different sources of astrophysical
signal, so you can accurately calibrate instrumental (for example, flat field)
and natural (for example, the Sky) artifacts.
+The test above showed that the three images are already astrometrically
calibrated (the coverage of the pixel positions on the sky is consistent in
all of them).
+To confirm, you can zoom in to a certain object and check it at the pixel
level.
+It is always good to do the visual check above when you are confronted with
new images (and may not be confident about the accuracy of the astrometry).
+Do not forget that the goal here is to find the calibration of pixel values;
and that we assume pixel positions are already calibrated (the image already
has a good astrometry).
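Besides the visual inspection, a quick non-visual sanity check is to compare
the reported sky coverage of the files (a sketch using the Fits program's
@option{--skycoverage} option; here we assume the data are in HDU 0 of the
SDSS files and HDU 1 of the J-PLUS crop):

@example
$ astfits sdss1.fits --hdu=0 --skycoverage
$ astfits jplus-crop.fits --hdu=1 --skycoverage
@end example

The centers and widths printed for the SDSS images should fall within the
J-PLUS crop's coverage.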
-Before we consider other alternatives, let's first get an accurate measure of
the area that is covered by all 5 exposures.
-A first hint would be to simply multiply the widths along RA and Dec reported
above: @mymath{1.8808\times1.3924=2.6189} degrees squared.
+The SDSS images are Sky subtracted, while this single-exposure J-PLUS image
still contains the counts related to the Sky emission within it.
+In the J-PLUS survey, the sky-level in each pixel is kept in a separate
@code{BACKGROUND_MODEL} HDU of @file{jplus.fits.fz}; this allows you to use a
different sky if you like.
+The SDSS image FITS files also have multiple extensions.
+To understand our inputs, let's have a quick look at the basic info of each:
-However, the sky coverage reported above has two caveats:
-1) it doesn't take into account the blank pixels (NaN) that are on the four
corners of the @file{deep.fits} image.
-2) the differing area of the pixels on the spherical sky in relation to those
blank values can result in wrong estimations of the area.
-So far, these blank areas are very small (and do not constitute a large
portion of the image).
-As a result, these effects are negligible.
-However, as you get more creative with the dither pattern to optimize your
scientific goal, such blank areas will cover a larger fraction of your final
stack.
+@example
+$ astfits sdss1.fits
+Fits (GNU Astronomy Utilities) @value{VERSION}
+Run on Fri Apr 14 11:24:03 2023
+-----
+HDU (extension) information: 'sdss1.fits'.
+ Column 1: Index (counting from 0, usable with '--hdu').
+ Column 2: Name ('EXTNAME' in FITS standard, usable with '--hdu').
+ ('n/a': no name in HDU metadata)
+ Column 3: Image data type or 'table' format (ASCII or binary).
+ Column 4: Size of data in HDU.
+ Column 5: Units of data in HDU (only images).
+ ('n/a': no unit in HDU metadata, or HDU is a table)
+-----
+0 n/a float32 2048x1489 nanomaggy
+1 n/a float32 2048 n/a
+2 n/a table_binary 1x3 n/a
+3 n/a table_binary 1x31 n/a
-So let's get a very accurate estimation of the area that will not be affected
by the issues above.
-With the first command below, we'll use the @option{--pixelareaonwcs} option
of the Fits program that will return the area of each pixel (in pixel units of
degrees squared).
-After running the second command, please have a look at the output of the
first:
-@example
-$ astfits deep.fits --pixelareaonwcs --output=deep-pix-area.fits
-$ astscript-fits-view deep-pix-area.fits
+$ astfits jplus.fits.fz
+Fits (GNU Astronomy Utilities) @value{VERSION}
+Run on Fri Apr 14 11:21:30 2023
+-----
+HDU (extension) information: 'jplus.fits.fz'.
+ Column 1: Index (counting from 0, usable with '--hdu').
+ Column 2: Name ('EXTNAME' in FITS standard, usable with '--hdu').
+ ('n/a': no name in HDU metadata)
+ Column 3: Image data type or 'table' format (ASCII or binary).
+ Column 4: Size of data in HDU.
+ Column 5: Units of data in HDU (only images).
+ ('n/a': no unit in HDU metadata, or HDU is a table)
+-----
+0 n/a no-data 0 n/a
+1 IMAGE float32 9216x9232 adu
+2 MASKED_PIXELS int16 9216x9232 n/a
+3 BACKGROUND_MODEL float32 9216x9232 n/a
+4 MASK_MODEL uint8 9216x9232 n/a
@end example
-@cindex Gnomonic projection
-The gradient you see in this image (that gets slightly curved towards the top
of the image) is the effect of the default
@url{https://en.wikipedia.org/wiki/Gnomonic_projection, Gnomonic projection}
(summarized as @code{TAN} in the FITS WCS standard).
-Since this image is aligned with the celestial coordinates, as we increase the
declination, the pixel area also increases.
-For a comparison, please run the Fits program with the
@option{--pixelareaonwcs} option on the originally downloaded
@file{jplus-1050345.fits.fz} to see the distortion pattern of the camera's
optics, which dominates there.
-We can now use Arithmetic to set (to NaN) the areas of all the pixels that
were NaN in @file{deep.fits}, and sum all the remaining values to get an
accurate estimate of the area we get from this dither pattern:
+Therefore, in order to be able to compare the SDSS and J-PLUS images, we
should first subtract the sky from the J-PLUS image.
+To do that, we can either subtract the @code{BACKGROUND_MODEL} HDU from the
@code{IMAGE} HDU using @ref{Arithmetic}, or we can use @ref{NoiseChisel} to
find a good sky ourselves.
+As scientists we like to tweak and be creative, so let's estimate it ourselves!
+Also, in some cases you may not have a pre-estimated Sky value, so you should
be prepared:
@example
-$ astarithmetic deep-pix-area.fits deep.fits isblank nan where -g1 \
- sumvalue --quiet
-2.57318473063151e+00
+$ astnoisechisel jplus-crop.fits --output=jplus-nc.fits
+$ astscript-fits-view jplus-nc.fits
@end example
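For reference, the first (Arithmetic-based) option mentioned above would be a
single command like the sketch below: it subtracts the @code{BACKGROUND_MODEL}
HDU from the @code{IMAGE} HDU (the HDU names come from the @command{astfits}
listing above; the output name is just our choice here):

@example
$ astarithmetic jplus.fits.fz jplus.fits.fz - \
                -hIMAGE -hBACKGROUND_MODEL \
                --output=jplus-nosky.fits
@end example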
-As expected, this is very close to the simple multiplication that we did above.
-But it will allow us to accurately estimate the size of the deep field with
any dither pattern.
+Notice that there is a relatively bright star in the center-bottom of the
image.
+In the ``Cube'' window, click on the ``Next'' button to see the
@code{DETECTIONS} HDU.
+The large footprint of the bright star is obvious.
+Press the ``Next'' button one more time to get to the @code{SKY} HDU.
+You see that in the center-bottom, the footprint of the large star is clearly
visible in the measured Sky level.
+This is not good: the measured Sky level goes above 54 ADU in the center of
the star (the white pixels)!
+This over-subtracted Sky level in part of the image will affect your magnitude
measurements and thus the zero point!
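If you prefer a number over the visual impression, the maximum of the
@code{SKY} extension shows the same problem (a quick check with the Statistics
program; the exact value may differ slightly on your system):

@example
$ aststatistics jplus-nc.fits -hSKY --maximum
@end example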
-As mentioned above, M94 is about half a degree in diameter; so let's set
@code{step_arcmin=15}.
-This is one quarter of a degree, and will put the centers of the four outer
exposures on the four corners of M94's main ring.
-You are now ready to repeat the commands above with this changed value.
+In @ref{General program usage tutorial}, we have a section on @ref{NoiseChisel
optimization for detection}; there is also a full tutorial on this in
@ref{Detecting large extended targets}.
+Therefore, we will not go into the details of NoiseChisel optimization here.
+Given the large images of J-PLUS, we will increase the tile-size to
@mymath{100\times100} pixels and the number of neighbors to identify outlying
tiles to 50 (these are usually the first parameters you should start editing
when you are confronted with a new image).
+After the second command, check the @code{SKY} extension to confirm that there
is no footprint of any bright object there.
+You will still see a gradient, but note the minimum and maximum values of the
Sky level: their difference is about 25 times smaller than the noise standard
deviation (so statistically speaking, it is pretty flat!).
-@cartouche
-@noindent
-@strong{Writing scripts:}
-It is better to write the steps above as a script so you can easily change the
basic settings and see the output fast.
-For more on writing scripts, see @ref{Writing scripts to automate the
steps}.
-@end cartouche
+@example
+$ astnoisechisel jplus-crop.fits --output=jplus-nc.fits \
+ --tilesize=100,100 --outliernumngb=50
+$ astscript-fits-view jplus-nc.fits
-After you run the commands above with this single change, you will get a total
area of 2.1597 degrees squared.
-This is roughly @mymath{15\%} smaller than the previous area; but it is much
easier to calibrate.
-However, since each pointing's center will fall on one edge of the galaxy, M94
will be present in all the exposures while doing the calibrations.
-We already see a large ring around this galaxy, and when we do a low surface
brightness optimized reduction, there is a chance that the size of the galaxy
is much larger.
-Ideally, you want your target to be on the four edges/corners of each image.
-This will make sure that a large fraction of each exposure will not be covered
by your final target, allowing you to calibrate much more accurately.
-Let's try setting @code{step_arcmin=40} (almost half the width of the
detector).
-You will notice that the area is now 0.05013 degrees squared!
-This is 51 times smaller!
+## Check that the gradient in the sky is statistically negligible.
+$ aststatistics jplus-nc.fits -hSKY --minimum --maximum \
+ | awk '@{print $2-$1@}'
+0.32809
+$ aststatistics jplus-nc.fits -hSKY_STD --median
+8.377977e+00
+@end example
-Take a look at @file{deep.fits}, and you will see that it is a horizontally
elongated rectangle!
-To see the cause, have a look at the @file{stack.fits}: the part with 5
exposures is now very small; covered by a cross-like pattern (which is thicker
along the horizontal) that is four exposures deep and even a larger square-like
region which is three exposures deep.
+We are now ready to find the zero point!
+First, let's run the @command{astscript-zeropoint} with @option{--help} to see
the option names (recall that you can see more details of each option in
@ref{Invoking astscript-zeropoint}).
+For the first time, let's use the script in the simplest state possible.
+We will keep only the essential options: the names of the input and reference
images (and their HDUs), the name of the output, and also two apertures with
radii of 3 arcsec to start with:
-The difference between 3 exposures and 5 exposures seems like a lot at first.
-But let's calculate how much it actually affects the achieved signal-to-noise
ratio and the surface brightness limit (for more, see @ref{Quantifying
measurement limits}).
-The surface brightness limit (or upper-limit surface brightness) is calculated
by applying the definition of magnitude to the standard deviation of the
background.
-So let's calculate how much this difference in depth affects the sky standard
deviation.
+@example
+$ astscript-zeropoint --help
+$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
+ --refimgs=sdss1.fits,sdss2.fits \
+ --output=jplus-zeropoint.fits \
+ --refimgszp=22.5,22.5 \
+ --refimgshdu=0,0 \
+ --aperarcsec=3
+@end example
-Deep images will usually be dominated by @ref{Photon counting noise} (or
Poisson noise).
-Therefore, if a single exposure image has a sky standard deviation of
@mymath{\sigma_s}, and we combine @mymath{N} such exposures, the sky standard
deviation on the stack will be @mymath{\sigma_s/\sqrt{N}}.
-As a result, the surface brightness limit between the regions with @mymath{N}
exposures and @mymath{M} exposures differs by
@mymath{2.5\times\log_{10}(\sqrt{N/M})=1.25\times\log_{10}(N/M)} magnitudes.
-If we set @mymath{N=3} and @mymath{M=5}, we get a surface brightness magnitude
difference of 0.27!
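You can quickly verify this number on the command-line (a one-line check of
the arithmetic above; AWK's @code{log} is the natural logarithm, hence the
division):

@example
$ echo | awk '@{print 1.25*log(5/3)/log(10)@}'
0.277311
@end example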
+The output is a FITS table (because generally, you will give more apertures
and choose the best one based on a higher-level analysis).
+Let's check the output's internal structure with Gnuastro's @command{astfits}
program.
-This is a very small difference (given all the other sources of error that
will be present).
-Let's see how much we increase our stack area if we set @code{deep_thresh=3}.
-The newly calculated area is 2.6706 degrees squared!
-This is just slightly larger than the first trial (with @code{step_arcmin=1})!
-Therefore at the cost of decreasing our surface brightness limit by 0.27
magnitudes, we are now able to perfectly calibrate the individual exposures,
and even cover a larger area!
+@example
+$ astfits jplus-zeropoint.fits
+-----
+0 n/a no-data 0 n/a
+1 ZEROPOINTS table_binary 1x3 n/a
+2 APER-3 table_binary 321x2 n/a
+@end example
-@cartouche
-@noindent
-@strong{Calibration is very important:} Better calibration can result in a
fainter surface brightness limit than more exposures with poor calibration;
especially for very low surface brightness signal that covers a large area and
is systematically affected by calibration issues.
-@end cartouche
+You can see that there are two HDUs in this file.
+The HDU names give a hint, so let's have a look at each extension with
Gnuastro's @command{asttable} program:
-Based on the argument above, let's define our deep region to be the pixels
with 3 or more exposures.
-Now, let's have a look at the horizontally stretched cross that we see for the
regions with 4 exposures.
-The reason that the vertical component is thicker is that the same angular
step in RA and in Dec (defined on a curved sphere) corresponds to different
numbers of pixels on this flat image pixel grid.
+@example
+$ asttable jplus-zeropoint.fits --hdu=1 -i
+--------
+jplus-zeropoint.fits (hdu: 1)
+------- ----- ---- -------
+No.Name Units Type Comment
+------- ----- ---- -------
+1 APERTURE arcsec float32 n/a
+2 ZEROPOINT mag float32 n/a
+3 ZPSTD mag float32 n/a
+--------
+Number of rows: 1
+--------
+@end example
-To have the same size in both, we should divide the RA step by the cosine of
the declination.
-In the command below, we have shown the relevant changes in the dither table
construction above.
+@noindent
+As you can see, in the first extension, for each of the apertures you
requested (@code{APERTURE}), there is a zero point (@code{ZEROPOINT}) and the
standard deviation of the measurements on the apertures (@code{ZPSTD}).
+In this case, we only requested one aperture, so it only has one row.
+Now, let's have a look at the next extension:
@example
-$ echo $center_ra $center_dec \
- | awk '@{s='$step_arcmin'/60; fmt="%-10.6f %-10.6f\n"; \
- pi=atan2(0, -1); r=pi/180; \
- printf fmt, $1, $2; \
- printf fmt, $1+(s/cos($2*r)), $2; \
- printf fmt, $1, $2+s; \
- printf fmt, $1-(s/cos($2*r)), $2; \
- printf fmt, $1, $2-s@}' \
- >> dither.txt
+$ asttable jplus-zeropoint.fits --hdu=2 -i
+--------
+jplus-zeropoint.fits (hdu: 2)
+------- ----- ---- -------
+No.Name Units Type Comment
+------- ----- ---- -------
+1 MAG-REF f32 float32 Magnitude of reference.
+2 MAG-DIFF f32 float32 Magnitude diff with input.
+--------
+Number of rows: 321
+--------
@end example
-@noindent
-Here are two important points to consider when comparing the previous AWK
command with this one:
-@itemize
-@item
-The cosine function of AWK (@code{cos}) assumes that the input is in radians,
not degrees.
-We therefore have to multiply each declination (in degrees) by a variable
@code{r} that contains the conversion factor (@mymath{\pi/180}).
-@item
-AWK doesn't have the value of @mymath{\pi} in memory.
-We need to calculate it, and to do that, we use the @code{atan2} function (as
recommended in the AWK manual, for its definition in Gnuastro, see
@ref{Trigonometric and hyperbolic operators}).
-@end itemize
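To see these two points in action before using them in the full dither-table
command above, you can try a minimal call like this (a sketch; 41.12 is
roughly the declination of M94):

@example
$ echo 41.12 | awk '@{pi=atan2(0, -1); print pi, cos($1*pi/180)@}'
3.14159 0.753334
@end example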
+It contains a table of measurements for the aperture with the least scatter.
+In this case, we only gave one aperture, so it is the same.
+If you give multiple apertures, only the one with the least scatter will be
present by default.
+In the @code{MAG-REF} column you see the magnitudes within each aperture on
the reference (SDSS) image(s).
+The @code{MAG-DIFF} column contains the difference of the input (J-PLUS) and
reference (SDSS) magnitudes for each aperture (see @ref{Zero point estimation}).
+The two catalogs, created by the aperture photometry from the SDSS images, are
merged into one so that there are more stars to compare.
+Therefore, no matter how many reference images you provide, there will only be
a single table here.
+If the two SDSS images overlapped, each object in the overlap region would
have two rows (one row for the measurement from one SDSS image, and another
from the measurement from the other).
-Please use the new AWK command above in your script, and run everything else
unchanged.
-Afterwards, open @file{deep.fits}.
-You will see that the widths of both the horizontal and vertical regions are
the same.
+Now that we have obtained the zero point of the J-PLUS image, let's go a
little deeper into the lower-level details of how this script operates.
+This will help you better understand what happened, and how to interpret and
improve the results when you are confronted with a new image that produces
strange outputs.
-@cartouche
-@noindent
-@strong{RA and Dec should be treated differently:} As shown above, when
considering differences between two points in your dither pattern, it is
important to remember that a step in RA only corresponds to its true angular
size on the equator of the celestial sphere.
-So when you shift @mymath{+\delta} degrees parallel to the equator, from a
point that is located in RA and Dec of [@mymath{r}, @mymath{d}], the RA and Dec
of the new point are [@mymath{r+\delta/\cos(d)}, @mymath{d}].
-@end cartouche
+While it runs, the @command{astscript-zeropoint} script keeps its temporary
files in a temporary directory, and deletes it (with all the intermediate
products) once it is done.
+If you would like to check the temporary files of the intermediate steps, you
can use the @option{--keeptmp} option to not remove them.
-You can try making the cross-like region as thin as possible by slightly
increasing the step size.
-For example, set it to @code{step_arcmin=42}.
-When you open @file{deep.fits}, you will see that the deep region is almost
contiguous across the image (which is another positive factor!).
+Let's take a closer look into the contents of each HDU.
+First, we'll use Gnuastro's @command{asttable} to see the measured zero point
for this aperture.
+We are using @option{-Y} to have human-friendly (non-scientific!) numbers
(which are sufficient here) and @option{-O} to also show the metadata of each
column at the start.
-You can construct any complex dither pattern (with more than 5 points) based
on the logic and reasoning above to help extract the most science from the
valuable telescope time that you will be getting.
-Of course, factors like the optimal exposure time are also critical, but that
is beyond the scope of this tutorial.
+@example
+$ asttable jplus-zeropoint.fits -Y -O
+# Column 1: APERTURE [arcsec,f32,] Aperture used.
+# Column 2: ZEROPOINT [mag ,f32,] Zero point (sig-clip median).
+# Column 3: ZPSTD [mag ,f32,] Zero point Standard deviation.
+3.000 26.435 0.057
+@end example
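As the column metadata above says, the @code{ZEROPOINT} is simply the
sigma-clipped median of the per-star magnitude differences in the aperture's
own table (which we will look at next).
So you can also reproduce it manually (a sketch using the Statistics program's
sigma-clipping measurements; depending on the clipping parameters, see
@option{--sclipparams}, the result should be very close to the 26.435 above):

@example
$ asttable jplus-zeropoint.fits --hdu=APER-3 -cMAG-DIFF \
           | aststatistics --sigclip-median
@end example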
+@noindent
+Now, let's have a look at the first 10 rows of the second (@code{APER-3})
extension.
+From the previous check we did above, we see that it contains 321 rows!
+@example
+$ asttable jplus-zeropoint.fits -Y -O --hdu=APER-3 --head=10
+# Column 1: MAG-REF [f32,f32,] Magnitude of reference.
+# Column 2: MAG-DIFF [f32,f32,] Magnitude diff with input.
+16.461 30.035
+16.243 28.209
+15.427 26.427
+20.064 26.459
+17.334 26.425
+20.518 26.504
+17.100 26.400
+16.919 26.428
+17.654 26.373
+15.392 26.429
+@end example
+But the table above is hard to interpret, so let's plot it.
+To do this, we'll use the same @command{astscript-fits-view} command above
that we used for images.
+It detects if the file has an image or table HDU and will call DS9 or TOPCAT
respectively.
+You can also use any other plotter you like (TOPCAT is not part of Gnuastro);
this script just calls it.
+@example
+$ astscript-fits-view jplus-zeropoint.fits --hdu=APER-3
+@end example
-@node Installation, Common program behavior, Tutorials, Top
-@chapter Installation
+After @code{TOPCAT} opens, you can select the ``Graphics'' menu and then
``Plain plot''.
+This will show a plot with the SDSS (reference image) magnitude on the
horizontal axis and the difference of magnitudes between the input and
reference (the zero point) on the vertical axis.
-@c This link is put here because the `Quick start' section of the first
-@c chapter is not the most eye-catching part of the manual and some users
-@c were seen to follow this ``Installation'' chapter title in search of the
-@c tarball and fast instructions.
-@cindex Installation
-The latest released version of Gnuastro source code is always available at the
following URL:
+In an ideal world, the zero point should be independent of the magnitude of
the different stars that were used.
+Therefore, this plot should be a horizontal line (with some scatter as we go
to fainter stars).
+But as you can see in the plot, in the real world, this expected behavior is
seen only for stars with magnitudes about 16 to 19 in the reference SDSS images.
+The stars that are brighter than 16 are saturated in one (or both)
surveys@footnote{To learn more about saturated pixels and recognition of the
saturated level of the image, please see @ref{Saturated pixels and Segment's
clumps}}.
+Therefore, they do not have the correct magnitude or mag-diff.
+You can check some of these stars visually by using the blinking command above
and zooming into some of the brighter stars in the SDSS images.
-@url{http://ftpmirror.gnu.org/gnuastro/gnuastro-latest.tar.gz}
+@cindex Depth of data
+On the other hand, it is natural that we cannot measure accurate magnitudes
for the fainter stars because the noise level (or ``depth'') of each image is
limited.
+As a result, the horizontal line becomes wider (scattered) as we go to the
right (fainter magnitudes on the horizontal axis).
+So, let's limit the range of magnitudes used from the SDSS catalog to
calculate a more accurate zero point for the J-PLUS image.
+For this reason, we have the @option{--magnituderange} option in
@command{astscript-zeropoint}.
+@cartouche
@noindent
-@ref{Quick start} describes the commands necessary to configure, build, and
install Gnuastro on your system.
-This chapter will be useful in cases where the simple procedure above is not
sufficient, for example, your system lacks a mandatory/optional dependency (in
other words, you cannot pass the @command{$ ./configure} step), or you want
greater customization, or you want to build and install Gnuastro from other
random points in its history, or you want a higher level of control on the
installation.
-Thus if you were happy with downloading the tarball and following @ref{Quick
start}, then you can safely ignore this chapter and come back to it in the
future if you need more customization.
-
-@ref{Dependencies} describes the mandatory, optional and bootstrapping
dependencies of Gnuastro.
-Only the first group is required/mandatory when you are building Gnuastro
using a tarball (see @ref{Release tarball}); they are very basic and low-level
tools used in most astronomical software, so you might already have them
installed; if not, they are very easy to install as described for each.
-@ref{Downloading the source} discusses the two methods you can obtain the
source code: as a tarball (a significant snapshot in Gnuastro's history), or
the full history@footnote{@ref{Bootstrapping dependencies} are required if you
clone the full history.}.
-The latter allows you to build Gnuastro at any random point in its history
(for example, to get bug fixes or new features that are not released as a
tarball yet).
+@strong{Necessity of sky subtraction:}
+To obtain this horizontal line, it is very important that both your images
have been sky subtracted.
+Please repeat the last @command{astscript-zeropoint} command above, changing
only the input file to @file{jplus-crop.fits}.
+Then use Gnuastro's @command{astscript-fits-view} again to draw a plot with
@code{TOPCAT} (same as above).
+Instead of a horizontal line, you will see @emph{a sloped line} in the
magnitude range above!
+This happens because the sky level acts as a source of constant signal in all
apertures, so the magnitude difference will not be independent of the star's
magnitude, but dependent on it (the measurement on a fainter star will be
dominated by the sky level).
-The building and installation of Gnuastro is heavily customizable; to learn
more, see @ref{Build and install}.
-This section is essentially a thorough explanation of the steps in @ref{Quick
start}.
-It discusses ways you can influence the building and installation.
-If you encounter any problems in the installation process, it is probably
already explained in @ref{Known issues}.
-In @ref{Other useful software} the installation and usage of some other free
software that are not directly required by Gnuastro but might be useful in
conjunction with it is discussed.
+@strong{Remember:} if you see a sloped line instead of a horizontal line, the
input or reference image(s) are not sky subtracted.
+@end cartouche
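For completeness, the check suggested in the box above is just the first
command of this section with a different input (a sketch; we assume the crop
is in HDU 1 of Crop's output, and we use a new output name so the earlier
result is not over-written):

@example
$ astscript-zeropoint jplus-crop.fits --hdu=1 \
           --refimgs=sdss1.fits,sdss2.fits \
           --output=jplus-zeropoint-sky.fits \
           --refimgszp=22.5,22.5 \
           --refimgshdu=0,0 \
           --aperarcsec=3
@end example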
+Another key parameter of this script is the aperture size
(@option{--aperarcsec}) for the aperture photometry of images.
+On one hand, if the selected aperture is too small, you will be at the mercy
of the differing PSFs between your input and reference image(s): part of the
light of the star will be lost in the image with the worse PSF.
+On the other hand, with large aperture size, the light of neighboring objects
(stars/galaxies) can affect the photometry.
+We should select an aperture radius of the same order as the one used in the
reference image, typically 2 to 3 times the PSF FWHM of the images.
+For now, let's try the values 2, 3, 4, 5, and 6 arcsec for the aperture
radii.
+The script will compare the results for these aperture sizes and choose the
one with the smallest standard deviation value (the @code{ZPSTD} column of the
@code{ZEROPOINTS} HDU).
-@menu
-* Dependencies:: Necessary packages for Gnuastro.
-* Downloading the source:: Ways to download the source code.
-* Build and install:: Configure, build and install Gnuastro.
-@end menu
+Let's re-run the script with the following changes:
+@itemize
+@item
+Using @option{--magnituderange} to limit the stars used for estimating the
zero point.
+@item
+Giving more values for aperture size to find the best for these two images as
explained above.
+@item
+Using the @option{--keepzpap} option to keep the result of matching the
catalogs for each requested aperture in a separate extension of the output
file.
+@end itemize
+@example
+$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
+ --refimgs=sdss1.fits,sdss2.fits \
+ --output=jplus-zeropoint.fits \
+ --refimgszp=22.5,22.5 \
+ --aperarcsec=2,3,4,5,6 \
+ --magnituderange=16,18 \
+ --refimgshdu=0,0 \
+ --keepzpap
+@end example
+Now, check the number of HDU extensions with @command{astfits}.
+@example
+$ astfits jplus-zeropoint.fits
+-----
+0 n/a no-data 0 n/a
+1 ZEROPOINTS table_binary 5x3 n/a
+2 APER-2 table_binary 319x2 n/a
+3 APER-3 table_binary 321x2 n/a
+4 APER-4 table_binary 323x2 n/a
+5 APER-5 table_binary 323x2 n/a
+6 APER-6 table_binary 325x2 n/a
+@end example
-@node Dependencies, Downloading the source, Installation, Installation
-@section Dependencies
-
-A minimal set of dependencies are mandatory for building Gnuastro from the
standard tarball release.
-If they are not present you cannot pass Gnuastro's configuration step.
-The mandatory dependencies are therefore very basic (low-level) tools which
are easy to obtain, build and install, see @ref{Mandatory dependencies} for a
full discussion.
-
-If you have the packages of @ref{Optional dependencies}, Gnuastro will have
additional functionality (for example, converting FITS images to JPEG or PDF).
-If you are installing from a tarball as explained in @ref{Quick start}, you
can stop reading after this section.
-If you are cloning the version controlled source (see @ref{Version controlled
source}), an additional bootstrapping step is required before configuration and
its dependencies are explained in @ref{Bootstrapping dependencies}.
-
-Your operating system's package manager is an easy and convenient way to
download and install the dependencies that are already pre-built for your
operating system.
-In @ref{Dependencies from package managers}, we will list some common
operating system package manager commands to install the optional and mandatory
dependencies.
+You can see that the output file now has a separate HDU for each aperture
(thanks to @option{--keepzpap}).
+The @code{ZEROPOINTS} HDU contains the final zero point values for each
aperture and their error.
+The best zero point value belongs to the aperture that has the least scatter
(has the lowest standard deviation).
+The rest of the extensions contain the zero point measurements within each
aperture (as discussed above).
-@menu
-* Mandatory dependencies:: Gnuastro will not install without these.
-* Optional dependencies:: Adding more functionality.
-* Bootstrapping dependencies:: If you have the version controlled source.
-* Dependencies from package managers:: Installing from OS package managers.
-@end menu
+Let's check the different tables by plotting all magnitude tables at the same
time with @code{TOPCAT}.
-@node Mandatory dependencies, Optional dependencies, Dependencies, Dependencies
-@subsection Mandatory dependencies
+@example
+$ astscript-fits-view jplus-zeropoint.fits
+@end example
-@cindex Dependencies, Gnuastro
-@cindex GNU build system
-The mandatory Gnuastro dependencies are very basic and low-level tools.
-They all follow the same basic GNU based build system (like that shown in
@ref{Quick start}), so even if you do not have them, installing them should be
pretty straightforward.
-In this section we explain each program and any specific note that might be
necessary in the installation.
+@noindent
+After @code{TOPCAT} has opened take the following steps:
+@enumerate
+@item
+From the ``Graphics'' menu, select ``Plain plot''.
+You will see the last HDU's scatter plot open in a new window (for
@code{APER-6}, with red points).
+The bottom-left panel shows the icon of a red-blue scatter plot, with
@code{6:jplus-zeropoint.fits} written in front of it (showing that this is the
6th HDU of this file).
+In the bottom-right panel, you see the names of the columns that are being
displayed.
+@item
+In the ``Layers'' menu, click on ``Add Position Control''.
+On the bottom-left panel, you will notice that a new blue-red scatter plot has
appeared but it just says @code{<no table>}.
+In the bottom-right panel, in front of ``Table:'', select any other extension.
+This will plot the same two columns of that extension as blue points.
+Zoom-in to the region of the horizontal line to see/compare the different
scatters.
+Change the HDU given to ``Table:'' and see the distribution of zero points for
the different apertures.
+@end enumerate
-@menu
-* GNU Scientific Library:: Installing GSL.
-* CFITSIO:: C interface to the FITS standard.
-* WCSLIB:: C interface to the WCS standard of FITS.
-@end menu
+The manual/visual operation above is critical if this is your first time with
a new dataset: it can reveal all kinds of systematic biases (like the Sky issue
above)!
+But once you know your data has no systematic biases, choosing between the
different apertures is not easy visually!
+Let's have a look at the table in the @code{ZEROPOINTS} HDU (we don't need to
explicitly call this HDU since it is the first one):
-@node GNU Scientific Library, CFITSIO, Mandatory dependencies, Mandatory
dependencies
-@subsubsection GNU Scientific Library
+@example
+$ asttable jplus-zeropoint.fits -O -Y
+# Column 1: APERTURE [arcsec,f32,] Aperture used.
+# Column 2: ZEROPOINT [mag ,f32,] Zero point (sig-clip median).
+# Column 3: ZPSTD [mag ,f32,] Zero point Standard deviation.
+2.000 26.405 0.028
+3.000 26.436 0.030
+4.000 26.448 0.035
+5.000 26.458 0.042
+6.000 26.466 0.056
+@end example
-@cindex GNU Scientific Library
-The @url{http://www.gnu.org/software/gsl/, GNU Scientific Library}, or GSL, is
a large collection of functions that are very useful in scientific
applications, for example, integration, random number generation, and Fast
Fourier Transform among many others.
-To download and install GSL from source, you can run the following commands.
+The most accurate zero point is the one where @code{ZPSTD} is the smallest.
+In this case, the minimum of @code{ZPSTD} occurs for the apertures with radii
of 2 and 3 arcseconds.
+Run the @command{astscript-fits-view} command above again to open TOPCAT.
+Let's focus on the magnitude plots in these two apertures and determine a more
accurate range of magnitude.
+The more reliable option is the range between 16.4 (where we have no saturated
stars) and 18.5 mag (fainter than this, the scatter becomes too strong).
+Finally, let's set some more apertures between 2 and 3 arcseconds radius:
@example
-$ wget http://ftpmirror.gnu.org/gsl/gsl-latest.tar.gz
-$ tar xf gsl-latest.tar.gz
-$ cd gsl-X.X # Replace X.X with version number.
-$ ./configure CFLAGS="$CFLAGS -g0 -O3"
-$ make -j8 # Replace 8 with no. CPU threads.
-$ make check
-$ sudo make install
+$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
+ --refimgs=sdss1.fits,sdss2.fits \
+ --output=jplus-zeropoint.fits \
+ --magnituderange=16.4,18.5 \
+ --refimgszp=22.5,22.5 \
+ --aperarcsec=2,2.5,3,3.5,4 \
+ --refimgshdu=0,0 \
+ --keepzpap
+
+$ asttable jplus-zeropoint.fits -Y
+2.000 26.405 0.037
+2.500 26.425 0.033
+3.000 26.436 0.034
+3.500 26.442 0.039
+4.000 26.449 0.044
@end example
-@node CFITSIO, WCSLIB, GNU Scientific Library, Mandatory dependencies
-@subsubsection CFITSIO
+The aperture with the least scatter is therefore the 2.5 arcsec radius
aperture, giving a zero point of 26.425 magnitudes for this image.
+However, you can see that the scatter for the 3 arcsec aperture is also
acceptable.
+Actually, the @code{ZPSTD} values of the 2.5 and 3 arcsec apertures only
differ by about @mymath{3\%} (@mymath{=(0.034-0.033)/0.033\times100}).
+So simply choosing the minimum is just a first-order approximation (which is
accurate to within @mymath{26.436-26.425=0.011} magnitudes).
-@cindex CFITSIO
-@cindex FITS standard
-@url{http://heasarc.gsfc.nasa.gov/fitsio/, CFITSIO} is the closest you can get
to the pixels in a FITS image while remaining faithful to the
@url{http://fits.gsfc.nasa.gov/fits_standard.html, FITS standard}.
-It is written by William Pence, the principal author of the FITS
standard@footnote{Pence, W.D. et al. Definition of the Flexible Image Transport
System (FITS), version 3.0. (2010) Astronomy and Astrophysics, Volume 524,
id.A42, 40 pp.}, and is regularly updated.
-It sets the definitions for all other software packages using FITS images.
+Note that in aperture photometry, the PSF plays an important role (because the
aperture is fixed but the two images can have very different PSFs).
+The aperture with the least scatter should also account for the differing PSFs.
+Overall, please always check the different and intermediate steps to make
sure the parameters are good, so that the estimation of the zero point is
correct.
-@vindex --enable-reentrant
-@cindex Reentrancy, multiple file opening
-@cindex Multiple file opening, reentrancy
-Some GNU/Linux distributions have CFITSIO in their package managers, if it is
available and updated, you can use it.
-One problem that might occur is that CFITSIO might not be configured with the
@option{--enable-reentrant} option by the distribution.
-This option allows CFITSIO to open a file in multiple threads; it can thus
provide great speed improvements.
-If CFITSIO was not configured with this option, any program which needs this
capability will warn you and abort when you ask for multiple threads (see
@ref{Multi-threaded operations}).
+If you are happy with the minimum, you don't have to search for the minimum
aperture or its corresponding zero point yourself.
+This script has written it in the @code{ZPVALUE} keyword of the table.
+With the first command below, we also see the name of the file (you can
therefore use this on many files, for example).
+With the second command, we only print the number, by adding the @option{-q}
(or @option{--quiet}) option (this is useful in a script where you want to
store the value in a shell variable to use later).
-To install CFITSIO from source, we strongly recommend that you have a look
through Chapter 2 (Creating the CFITSIO library) of the CFITSIO manual and
understand the options you can pass to @command{$ ./configure} (there are not
too many).
-This is a very basic package for most astronomical software and it is best
that you configure it nicely with your system.
-Once you download the source and unpack it, the following configure script
should be enough for most purposes.
-Do not forget to read chapter two of the manual though; for example, the
second option is only for 64-bit systems.
-The manual also explains how to check if it has been installed correctly.
+@example
+$ astfits jplus-zeropoint.fits --keyvalue=ZPVALUE
+jplus-zeropoint.fits 2.642512e+01
-CFITSIO comes with two executable files called @command{fpack} and
@command{funpack}.
-From their manual: they ``are standalone programs for compressing and
uncompressing images and tables that are stored in the FITS (Flexible Image
Transport System) data format.
-They are analogous to the gzip and gunzip compression programs except that
they are optimized for the types of astronomical images that are often stored
in FITS format''.
-The commands below will compile and install them on your system along with
CFITSIO.
-They are not essential for Gnuastro, since they are just wrappers for
functions within CFITSIO, but they can come in handy.
-The @command{make utils} command is only available for versions above 3.39;
it will build these executable files along with several other executable test
files, which are deleted in the following commands before the installation
(otherwise the test files would also be installed).
+$ astfits jplus-zeropoint.fits --keyvalue=ZPVALUE -q
+2.642512e+01
+@end example
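For example, the scripting scenario mentioned above would look something like
this (a minimal sketch):

@example
$ zp=$(astfits jplus-zeropoint.fits --keyvalue=ZPVALUE --quiet)
$ echo "The zero point of this image is $zp"
@end example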
-The commands necessary to download the source, decompress, build and install
CFITSIO from source are described below.
+Generally, this script will write the following FITS keywords (all starting with @code{ZP}) in its output, for your future reference:
@example
-$ urlbase=http://heasarc.gsfc.nasa.gov/FTP/software/fitsio/c
-$ wget $urlbase/cfitsio_latest.tar.gz
-$ tar xf cfitsio_latest.tar.gz
-$ cd cfitsio-X.XX # Replace X.XX with version
-$ ./configure --prefix=/usr/local --enable-sse2 --enable-reentrant \
- CFLAGS="$CFLAGS -g0 -O3"
-$ make
-$ make utils
-$ ./testprog > testprog.lis # See below if this has an error
-$ diff testprog.lis testprog.out # Should have no output
-$ cmp testprog.fit testprog.std # Should have no output
-$ rm cookbook fitscopy imcopy smem speed testprog
-$ sudo make install
+$ astfits jplus-zeropoint.fits -h1 | grep ^ZP
+ZPAPER = 2.5 / Best aperture.
+ZPVALUE = 26.42512 / Best zero point.
+ZPSTD = 0.03276644 / Best std. dev. of zeropoint.
+ZPMAGMIN= 16.4 / Min mag for obtaining zeropoint.
+ZPMAGMAX= 18.5 / Max mag for obtaining zeropoint.
@end example
-In the @code{./testprog > testprog.lis} step, you may confront an error,
complaining that it cannot find @file{libcfitsio.so.AAA} (where @code{AAA} is
an integer).
-This is the library that you just built and have not yet installed.
-But unfortunately some versions of CFITSIO do not account for this on some OSs.
-To fix the problem, you need to tell your OS to also look into current CFITSIO
build directory with the first command below, afterwards, the problematic
command (second below) should run properly.
+Using the @option{--keyvalue} option of the @ref{Fits} program, you can easily get multiple values in one run (where necessary):
@example
-$ export LD_LIBRARY_PATH="$(pwd):$LD_LIBRARY_PATH"
-$ ./testprog > testprog.lis
+$ astfits jplus-zeropoint.fits --hdu=1 --quiet \
+ --keyvalue=ZPAPER,ZPVALUE,ZPSTD
+2.500000e+00 2.642512e+01 3.276644e-02
@end example
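+
+If you want each of these values in its own shell variable, one possible sketch (in Bash; the variable names are arbitrary) is the following:
+
+@example
+$ read zpaper zpvalue zpstd <<< $(astfits jplus-zeropoint.fits \
+                                          --hdu=1 --quiet \
+                                          --keyvalue=ZPAPER,ZPVALUE,ZPSTD)
+$ echo $zpstd
+3.276644e-02
+@end example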
-Recall that the modification above is ONLY NECESSARY FOR THIS STEP.
-@emph{Do Not} put the @code{LD_LIBRARY_PATH} modification command in a
permanent place (like your bash startup file).
-After installing CFITSIO, close your terminal and continue working on a new
terminal (so @code{LD_LIBRARY_PATH} has its default value).
-For more on @code{LD_LIBRARY_PATH}, see @ref{Installation directory}.
-
+@node Zero point tutorial with reference catalog, , Zero point tutorial with
reference image, Zero point of an image
+@subsection Zero point tutorial with reference catalog
+In @ref{Zero point tutorial with reference image}, we explained how to use @command{astscript-zeropoint} to estimate the zero point of one image based on a reference image.
+Sometimes a reference image is not available and we need to use a reference catalog.
+Fortunately, @command{astscript-zeropoint} can also use the catalog instead of
the image to find the zero point.
+To show this, let's download a catalog of SDSS in the area that overlaps with
the cropped J-PLUS image (used in the previous section).
+For more on Gnuastro's Query program, please see @ref{Query}.
+The ID, RA, Dec and magnitude (in the SDSS @emph{r} filter) columns are called by their names in the SDSS catalog.
+@example
+$ astquery vizier \
+ --dataset=sdss12 \
+ --overlapwith=jplus-crop.fits \
+ --column=objID,RA_ICRS,DE_ICRS,rmag \
+ --output=sdss-catalog.fits
+@end example
-@node WCSLIB, , CFITSIO, Mandatory dependencies
-@subsubsection WCSLIB
+To visualize the position of the SDSS objects over the J-PLUS image, let's use
@command{astscript-ds9-region} (for more details please see @ref{SAO DS9 region
files from table}) with the command below (it will automatically open DS9 and
load the regions it created):
-@cindex WCS
-@cindex WCSLIB
-@cindex World Coordinate System
-@url{http://www.atnf.csiro.au/people/mcalabre/WCS/, WCSLIB} is written and
maintained by one of the authors of the World Coordinate System (WCS)
definition in the @url{http://fits.gsfc.nasa.gov/fits_standard.html, FITS
standard}@footnote{Greisen E.W., Calabretta M.R. (2002) Representation of world
coordinates in FITS.
-Astronomy and Astrophysics, 395, 1061-1075.}, Mark Calabretta.
-It might be already built and ready in your distribution's package management
system.
-However, here the installation from source is explained, for the advantages of
installation from source please see @ref{Mandatory dependencies}.
-To install WCSLIB you will need to have CFITSIO already installed, see
@ref{CFITSIO}.
+@example
+$ astscript-ds9-region sdss-catalog.fits \
+ --column=RA_ICRS,DE_ICRS \
+ --color=red --width=3 --output=sdss.reg \
+ --command="ds9 jplus-nc.fits[INPUT-NO-SKY] \
+ -scale zscale"
+@end example
-@vindex --without-pgplot
-WCSLIB also has plotting capabilities which use PGPLOT (a plotting library for
C).
-If you wan to use those capabilities in WCSLIB, @ref{PGPLOT} provides the
PGPLOT installation instructions.
-However PGPLOT is old@footnote{As of early June 2016, its most recent version
was uploaded in February 2001.}, so its installation is not easy, there are
also many great modern WCS plotting tools (mostly in written in Python).
-Hence, if you will not be using those plotting functions in WCSLIB, you can
configure it with the @option{--without-pgplot} option as shown below.
+Now, we are ready to estimate the zero point of the J-PLUS image based on the
SDSS catalog.
+To download the input image and understand how to use @command{astscript-zeropoint}, please see @ref{Zero point tutorial with reference image}.
-If you have the cURL library @footnote{@url{https://curl.haxx.se}} on your
system and you installed CFITSIO version 3.42 or later, you will need to also
link with the cURL library at configure time (through the @code{-lcurl} option
as shown below).
-CFITSIO uses the cURL library for its HTTPS (or HTTP
Secure@footnote{@url{https://en.wikipedia.org/wiki/HTTPS}}) support and if it
is present on your system, CFITSIO will depend on it.
-Therefore, if @command{./configure} command below fails (you do not have the
cURL library), then remove this option and rerun it.
+Many of the options (like the aperture sizes and the magnitude range) are the same, so we will not discuss them further.
+You will notice that the only substantive difference between the command below and the last command of the previous section is that we are using @option{--refcat} instead of @option{--refimgs}.
+There are also some cosmetic differences (for example, a new output name, not using @option{--refimgszp} since it is only necessary for images, and the @option{--refcat*} options which identify the names of the necessary columns of the input catalog):
-To download, configure, build, check and install WCSLIB from source, you can
follow the steps below.
@example
-## Download and unpack the source tarball
-$ wget ftp://ftp.atnf.csiro.au/pub/software/wcslib/wcslib.tar.bz2
-$ tar xf wcslib.tar.bz2
+$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
+ --refcat=sdss-catalog.fits \
+ --refcatmag=rmag \
+ --refcatra=RA_ICRS \
+ --refcatdec=DE_ICRS \
+ --output=jplus-zeropoint-cat.fits \
+ --magnituderange=16.4,18.5 \
+ --aperarcsec=2,2.5,3,3.5,4 \
+ --keepzpap
+@end example
-## In the `cd' command, replace `X.X' with version number.
-$ cd wcslib-X.X
+@noindent
+Let's inspect the output with the command below.
-## If `./configure' fails, remove `-lcurl' and run again.
-$ ./configure LIBS="-pthread -lcurl -lm" --without-pgplot \
- --disable-fortran CFLAGS="$CFLAGS -g0 -O3"
-$ make
-$ make check
-$ sudo make install
+@example
+$ asttable jplus-zeropoint-cat.fits -Y
+2.000 26.337 0.034
+2.500 26.386 0.036
+3.000 26.417 0.041
+3.500 26.439 0.043
+4.000 26.455 0.050
@end example
+As you see, the values and standard deviations are very similar to the results
we got previously in @ref{Zero point tutorial with reference image}.
+The standard deviations are generally a little higher here because we did not do the photometry ourselves, but they are statistically similar.
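+
+As in the previous section, the script should also have written its best aperture and zero point into the @code{ZP*} keywords of this output; for example (a quick check, following the same logic as before):
+
+@example
+$ astfits jplus-zeropoint-cat.fits --keyvalue=ZPVALUE -q
+@end example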
+Before we finish, let's open the two outputs (from a reference image and
reference catalog) with the command below.
+To confirm how they compare, we are showing the result for the @code{APER-3} extension of both (following the TOPCAT plotting recipe in @ref{Zero point tutorial with reference image}).
-@node Optional dependencies, Bootstrapping dependencies, Mandatory
dependencies, Dependencies
-@subsection Optional dependencies
-
-The libraries listed here are only used for very specific applications,
therefore they are optional and Gnuastro can be built without them (with only
those specific features disabled).
-Since these are pretty low-level tools, they are not too hard to install from
source, but you can also use your operating system's package manager to easily
install all of them.
-For more, see @ref{Dependencies from package managers}.
-
-@cindex GPL Ghostscript
-If the @command{./configure} script cannot find any of these optional
dependencies, it will notify you of the operation(s) you cannot do due to not
having them.
-If you continue the build and request an operation that uses a missing
library, Gnuastro's programs will warn that the optional library was missing at
build-time and abort.
-Since Gnuastro was built without that library, installing the library
afterwards will not help.
-The only way is to re-build Gnuastro from scratch (after the library has been
installed).
-However, for program dependencies (like cURL or Ghostscript) things are
easier: you can install them after building Gnuastro also.
-This is because libraries are used to build the internal structure of
Gnuastro's executables.
-However, a program dependency is called by Gnuastro's programs at run-time and
has no effect on their internal structure.
-So if a dependency program becomes available later, it will be used next time
it is requested.
-
-@table @asis
-
-@item GNU Libtool
-@cindex GNU Libtool
-Libtool is a program to simplify managing of the libraries to build an
executable (a program).
-GNU Libtool has some added functionality compared to other implementations.
-If GNU Libtool is not present on your system at configuration time, a warning
will be printed and @ref{BuildProgram} will not be built or installed.
-The configure script will look into your search path (@code{PATH}) for GNU
Libtool through the following executable names: @command{libtool} (acceptable
only if it is the GNU implementation) or @command{glibtool}.
-See @ref{Installation directory} for more on @code{PATH}.
-
-GNU Libtool (the binary/executable file) is a low-level program that is
probably already present on your system, and if not, is available in your
operating system package manager@footnote{Note that we want the
binary/executable Libtool program which can be run on the command-line.
-In Debian-based operating systems which separate various parts of a package,
you want want @code{libtool-bin}, the @code{libtool} package will not contain
the executable program.}.
-If you want to install GNU Libtool's latest version from source, please visit
its @url{https://www.gnu.org/software/libtool/, web page}.
+@example
+$ astscript-fits-view jplus-zeropoint.fits jplus-zeropoint-cat.fits \
+ -hAPER-3
+@end example
-Gnuastro's tarball is shipped with an internal implementation of GNU Libtool.
-Even if you have GNU Libtool, Gnuastro's internal implementation is used for
the building and installation of Gnuastro.
-As a result, you can still build, install and use Gnuastro even if you do not
have GNU Libtool installed on your system.
-However, this internal Libtool does not get installed.
-Therefore, after Gnuastro's installation, if you want to use
@ref{BuildProgram} to compile and link your own C source code which uses the
@ref{Gnuastro library}, you need to have GNU Libtool available on your system
(independent of Gnuastro).
-See @ref{Review of library fundamentals} to learn more about libraries.
-@item GNU Make extension headers
-@cindex GNU Make
-GNU Make is a workflow management system that can be used to run a series of
commands in a specific order, and in parallel if you want.
-GNU Make offers special features to extend it with custom functions within a
dynamic library.
-They are defined in the @file{gnumake.h} header.
-If @file{gnumake.h} can be found on your system at configuration time,
Gnuastro will build a custom library that GNU Make can use for extended
functionality in (astronomical) data analysis scenarios.
+@node Dither pattern design, , Zero point of an image, Tutorials
+@section Dither pattern design
-@item libgit2
-@cindex Git
-@pindex libgit2
-@cindex Version control systems
-Git is one of the most common version control systems (see @ref{Version
controlled source}).
-When @file{libgit2} is present, and Gnuastro's programs are run within a
version controlled directory, outputs will contain the version number of the
working directory's repository for future reproducibility.
-See the @command{COMMIT} keyword header in @ref{Output FITS files} for a
discussion.
+@cindex Dithering
+Deciding on a suitable dither pattern is one of the most important steps when planning your observation strategy.
+Dithering is the process of moving each exposure compared to the previous one; it is done for several reasons, like increasing resolution or expanding the area of the observation.
+For a more complete introduction to dithering, see @ref{Dithering pattern simulation}.
+Gnuastro has a script (@command{astscript-dither-simulate}) to simplify the process of choosing the best dither pattern and optimizing your observation strategy for your scientific goal.
-@item libjpeg
-@pindex libjpeg
-@cindex JPEG format
-libjpeg is only used by ConvertType to read from and write to JPEG images, see
@ref{Recognized file formats}.
-@url{http://www.ijg.org/, libjpeg} is a very basic library that provides tools
to read and write JPEG images, most Unix-like graphic programs and libraries
use it.
-Therefore you most probably already have it installed.
-@url{http://libjpeg-turbo.virtualgl.org/, libjpeg-turbo} is an alternative to
libjpeg.
-It uses Single instruction, multiple data (SIMD) instructions for ARM based
systems that significantly decreases the processing time of JPEG compression
and decompression algorithms.
+In this tutorial, let's assume you want to observe
@url{https://en.wikipedia.org/wiki/Messier_94, M94} in the H-alpha and rSDSS
filters@footnote{For the full list of available filters, see the
@url{https://oaj.cefca.es/telescopes/t80cam, T80Cam description}.} (to study
the extended star formation in the outer rings of this beautiful galaxy!).
+Including the outer parts of the rings, the galaxy is half a degree in
diameter!
+This is very large, and you want to design a dither pattern that will cover
this area with the maximum depth!
+Therefore, you need an instrument with a large field of view.
+Let's assume that after some searching, you decide to write a proposal for the
@url{https://oaj.cefca.es/telescopes/jast80, JAST80 telescope} at the
@url{https://oaj.cefca.es, Observatorio Astrofísico de Javalambre},
OAJ@footnote{For full disclosure, Gnuastro is being developed at CEFCA (Centro
de Estudios de F@'isica del Cosmos de Arag@'on); which also hosts OAJ.}, in
Teruel (Spain).
+The field of view of this telescope's camera is almost 1.5 degrees wide,
nicely fitting M94!
-@item libtiff
-@pindex libtiff
-@cindex TIFF format
-libtiff is used by ConvertType and the libraries to read TIFF images, see
@ref{Recognized file formats}.
-@url{http://www.simplesystems.org/libtiff/, libtiff} is a very basic library
that provides tools to read and write TIFF images, most Unix-like operating
system graphic programs and libraries use it.
-Therefore even if you do not have it installed, it must be easily available in
your package manager.
+Before we start, as described in @ref{Dithering pattern simulation}, it is important to remember that the ideal dither pattern depends primarily on your scientific objective, as well as the limitations of the instrument you are observing with.
+Therefore, there is no single dither pattern for all purposes.
+However, the tools, methods, criteria and logic to check whether a dither pattern satisfies a scientific requirement are similar.
+So you can use the same methods, tools and logic here to simulate or verify that your dither pattern will produce the products you expect after the observation.
+The hypothetical scenario above is just an example to show the usage.
+As with any tutorial, do not blindly follow the same final solution for any
scenario; this is just an example to show you @emph{how to} find your own
solution, not to give you the universal solution to any scenario!
-@item cURL
-@cindex cURL (downloading tool)
-cURL's executable (@command{curl}) is called by @ref{Query} for submitting
queries to remote datasets and retrieving the results.
-It is not necessary for the build of Gnuastro from source (only a warning will
be printed if it cannot be found at configure time), so if you do not have it
at build-time there is no problem.
-Just be sure to have it when you run @command{astquery}, otherwise you'll get
an error about not finding @command{curl}.
+To start simulating a dither pattern for a certain telescope, you just need a
single-exposure image of that telescope with WCS information.
+In other words, after astrometry, but before warping into any other pixel grid
(to combine into a deeper stack).
+The image will give us the default number of the camera's pixels, its pixel scale (the width of a pixel in arcseconds) and the camera distortion.
+These are reference parameters that are independent of the position of the
image on the sky.
-@item GPL Ghostscript
-@cindex GPL Ghostscript
-GPL Ghostscript's executable (@command{gs}) is called by ConvertType to
compile a PDF file from a source PostScript file, see @ref{ConvertType}.
-Therefore its headers (and libraries) are not needed.
+Because the actual position of the reference image is irrelevant, let's assume that in a previous project, presumably on @url{https://en.wikipedia.org/wiki/NGC_4395, NGC 4395}, you already have the download command of the following single-exposure image:
-@item Python3 with Numpy
-@cindex Numpy
-@cindex Python3
-Python is a high-level programming language and Numpy is the most commonly
used library within Python to add multi-dimensional arrays and matrices.
-If you configure Gnuastro with @option{--with-python} @emph{and} version 3 of
Python is available with a corresponding Numpy Library, Gnuastro's library will
be built with some Python-related helper functions.
-Python wrappers for Gnuastro's library (for example, `pyGnuastro') can use
these functions when being built from source.
-For more on Gnuastro's Python helper functions, see @ref{Python interface}.
+@example
+$ mkdir dither-tutorial
+$ cd dither-tutorial
+$ mkdir input
+$ siapurl=https://archive.cefca.es/catalogues/vo/siap
+$ wget $siapurl/jplus-dr3/reduced/get_fits?id=1050345 \
+ -O input/jplus-1050345.fits.fz
-@cindex PyPI
-This Python interface is only relevant if you want to build the Python
wrappers (like `pyGnuastro') from source.
-If you install the Gnuastro Python wrapper from a pre-built repository like
PyPI, this feature of your Gnuastro library won't be used.
-Pre-built libraries contain the full Gnuastro library that they need within
them (you don't even need to have Gnuastro at all!).
+$ astscript-fits-view input/jplus-1050345.fits.fz
+@end example
@cartouche
@noindent
-@strong{Can't find the Python3 and Numpy of a virtual environment:} make sure
to set the @code{$PYTHON} variable to point to the @code{python3} command of
the virtual environment before running @code{./configure}.
-Note that you don't need to activate the virtual env, just point @code{PYTHON}
to its Python3 executable, like the example below:
-
-@example
-$ python3 -m venv test-env # Setting up the virtual env.
-$ export PYTHON="$(pwd)/test-env/bin/python3"
-$ ./configure # Gnuastro's configure script.
-@end example
+@strong{This is the first time I am using an instrument:} In case you haven't already used images from your desired instrument (to use as a reference), you can find such images in its public archives, or by contacting the observatory.
+A single-exposure image is rarely of any scientific value (post-processing and stacking are necessary to make high-level and science-ready products).
+Therefore, such images become publicly available very soon after the observation date; furthermore, calibration images are usually public immediately.
@end cartouche
-@item SAO DS9
-SAO DS9 (@command{ds9}) is a visualization tool for FITS images.
-Gnuastro's @command{astscript-fits-view} program calls DS9 to visualize FITS
images.
-We have a full appendix on it and how to install it in @ref{SAO DS9}.
-Since it is a run-time dependency, it can be installed at any later time
(after building and installing Gnuastro).
+As you see from the image above, the T80Cam images are large (9216 by 9232
pixels).
+Therefore, to speed up the dither testing, let's down-sample the image above
by a factor of 10.
+This step is optional; you can safely use the full resolution, which will give you a more precise stack but will be much slower (maybe good after you have an approximate solution on the down-sampled image).
+We will call the output @file{ref.fits} (since it is the ``reference'' for our test).
+We are putting these two ``input'' files (to the script) in a dedicated directory to keep the running directory clean (so we can easily delete temporary/test files for a fresh start with `@command{rm *.fits}').
-@item TOPCAT
-TOPCAT (@command{topcat}) is a visualization tool for astronomical tables
(most commonly: plotting).
-Gnuastro's @command{astscript-fits-view} program calls TOPCAT it to visualize
tables.
-We have a full appendix on it and how to install it in @ref{TOPCAT}.
-Since it is a run-time dependency, it can be installed at any later time
(after building and installing Gnuastro).
-@end table
+@example
+$ astwarp input/jplus-1050345.fits.fz --scale=1/10 -oinput/ref.fits
+@end example
+For a first trial, let's create a cross-shaped dither pattern around M94, which is centered at an RA and Dec of 192.721250 and 41.120556.
+We'll center one exposure on the center of the galaxy, and include 4 more
exposures that are each 1 arc-minute away along the RA and Dec axes.
+To simplify the actual command later@footnote{Instead of this, later, when you call @command{astscript-dither-simulate}, you could pass the @option{--racol=1} and @option{--deccol=2} options.
+But having metadata is always preferred (it will avoid many bugs/frustrations in the long run!).}, let's also include the column names through two lines of metadata.
+@example
+$ step_arcmin=1
+$ center_ra=192.721250
+$ center_dec=41.120556
+$ echo "# Column 1: RA [deg, f64] Right Ascension" > dither.txt
+$ echo "# Column 2: Dec [deg, f64] Declination" >> dither.txt
-@node Bootstrapping dependencies, Dependencies from package managers, Optional
dependencies, Dependencies
-@subsection Bootstrapping dependencies
+$ echo $center_ra $center_dec \
+ | awk '@{s='$step_arcmin'/60; fmt="%-10.6f %-10.6f\n"; \
+ printf fmt, $1, $2; \
+ printf fmt, $1+s, $2; \
+ printf fmt, $1, $2+s; \
+ printf fmt, $1-s, $2; \
+ printf fmt, $1, $2-s@}' \
+ >> dither.txt
-Bootstrapping is only necessary if you have decided to obtain the full version
controlled history of Gnuastro, see @ref{Version controlled source} and
@ref{Bootstrapping}.
-Using the version controlled source enables you to always be up to date with
the most recent development work of Gnuastro (bug fixes, new functionalities,
improved algorithms, etc.).
-If you have downloaded a tarball (see @ref{Downloading the source}), then you
can ignore this subsection.
+$ cat dither.txt
+# Column 1: RA [deg, f64] Right Ascension
+# Column 2: Dec [deg, f64] Declination
+192.721250 41.120556
+192.804583 41.120556
+192.721250 41.203889
+192.637917 41.120556
+192.721250 41.037223
+@end example
-To successfully run the bootstrapping process, there are some additional
dependencies to those discussed in the previous subsections.
-These are low level tools that are used by a large collection of Unix-like
operating systems programs, therefore they are most probably already available
in your system.
-If they are not already installed, you should be able to easily find them in
any GNU/Linux distribution package management system (@command{apt-get},
@command{yum}, @command{pacman}, etc.).
-The short names in parenthesis in @command{typewriter} font after the package
name can be used to search for them in your package manager.
-For the GNU Portability Library, GNU Autoconf Archive and @TeX{} Live, it is
recommended to use the instructions here, not your operating system's package
manager.
+We are now ready to generate the exposure map of the dither pattern above, using the reference image that we just made.
+Let's put the center of our final stack on the center of the galaxy, and assume the stack has a size of 2 degrees.
+With the second command, you can see the exposure map of the final stack.
+Recall that in this image, each pixel shows the number of input images that went into it.
-@table @asis
+@example
+$ astscript-dither-simulate dither.txt --output=stack.fits \
+ --img=input/ref.fits --center=$center_ra,$center_dec \
+ --width=2
-@item GNU Portability Library (Gnulib)
-@cindex GNU C library
-@cindex Gnulib: GNU Portability Library
-@cindex GNU Portability Library (Gnulib)
-To ensure portability for a wider range of operating systems (those that do
not include GNU C library, namely glibc), Gnuastro depends on the GNU
portability library, or Gnulib.
-Gnulib keeps a copy of all the functions in glibc, implemented (as much as
possible) to be portable to other operating systems.
-The @file{bootstrap} script can automatically clone Gnulib (as a
@file{gnulib/} directory inside Gnuastro), however, as described in
@ref{Bootstrapping} this is not recommended.
+$ astscript-fits-view stack.fits
+@end example
-The recommended way to bootstrap Gnuastro is to first clone Gnulib and the
Autoconf archives (see below) into a local directory outside of Gnuastro.
-Let's call it @file{DEVDIR}@footnote{If you are not a developer in Gnulib or
Autoconf archives, @file{DEVDIR} can be a directory that you do not backup.
-In this way the large number of files in these projects will not slow down
your backup process or take bandwidth (if you backup to a remote server).}
(which you can set to any directory; preferentially where you keep your other
development projects).
-Currently in Gnuastro, both Gnulib and Autoconf archives have to be cloned in
the same top directory@footnote{If you already have the Autoconf archives in a
separate directory, or cannot clone it in the same directory as Gnulib, or you
have it with another directory name (not @file{autoconf-archive/}), you can
follow this short step.
-Set @file{AUTOCONFARCHIVES} to your desired address.
-Then define a symbolic link in @file{DEVDIR} with the following command so
Gnuastro's bootstrap script can find it:@*@command{$ ln -s $AUTOCONFARCHIVES
$DEVDIR/autoconf-archive}.} like the case here@footnote{If your internet
connection is active, but Git complains about the network, it might be due to
your network setup not recognizing the git protocol.
-In that case use the following URL for the HTTP protocol instead (for Autoconf
archives, replace the name): @command{http://git.sv.gnu.org/r/gnulib.git}}:
+Because the step size is so small (compared to the field of view), we see that
except for a thin boundary, we almost have 5 exposures over the full field of
view.
+Let's see what the width of the deepest part of the image is.
+First, we'll use Arithmetic to set all pixels that contain fewer than 5 exposures to NaN (the outer pixels).
+In the same Arithmetic command, we'll trim all the blank rows and columns, so the output only contains the non-blank region.
+Afterwards, we'll view the deep region with the second command.
+Finally, with the third command below, we'll use the @option{--skycoverage} option of the Fits program to see the coverage of the deep part on the sky.
@example
-$ DEVDIR=/home/yourname/Development ## Select any location.
-$ mkdir $DEVDIR ## If it doesn't exist!
-$ cd $DEVDIR
-$ git clone https://git.sv.gnu.org/git/gnulib.git
-$ git clone https://git.sv.gnu.org/git/autoconf-archive.git
-@end example
+$ deep_thresh=5
+$ astarithmetic stack.fits set-s s s $deep_thresh lt nan where trim \
+ --output=deep.fits
-Gnulib is a source-based dependency of Gnuastro's bootstrapping process, so
simply having it is enough on your computer, there is no need to install, and
thus check anything.
+$ astscript-fits-view deep.fits
-@noindent
-You now have the full version controlled source of these two repositories in
separate directories.
-Both these packages are regularly updated, so every once in a while, you can
run @command{$ git pull} within them to get any possible updates.
+$ astfits deep.fits --skycoverage
+...
+Sky coverage by center and (full) width:
+ Center: 192.72125 41.120556
+ Width: 1.880835157 1.392461166
+...
+@end example
-@item GNU Automake (@command{automake})
-@cindex GNU Automake
-GNU Automake will build the @file{Makefile.in} files in each sub-directory
using the (hand-written) @file{Makefile.am} files.
-The @file{Makefile.in}s are subsequently used to generate the @file{Makefile}s
when the user runs @command{./configure} before building.
+@cindex Sky value
+@cindex Flat field
+As we see, the width of this deep field is about 1.4 degrees (in Dec; the coverage in RA depends on the Dec).
+This nicely covers the outer parts of M94.
+However, there is a problem: with a step size of 1 arc-minute, the brighter central parts of this large galaxy will always fall on very similar pixels, making it hard to calibrate those pixels properly.
+If you are interested in the low surface brightness parts of this galaxy, it
is even worse: the outer parts of the galaxy will always cover similar parts of
the detector in all the exposures.
+To be able to accurately calibrate the image (in particular to estimate the
flat field pattern and subtract the sky), you do not want this to happen!
+You want each exposure to cover very different sources of astrophysical signal, so you can accurately calibrate instrumental artifacts (for example, the flat field) and natural ones (for example, the Sky).
-To check that you have a working GNU Automake in your system, you can try this
command:
+Before we consider other alternatives, let's first get an accurate measure of
the area that is covered by all 5 exposures.
+A first hint would be to simply multiply the widths along RA and Dec reported
above: @mymath{1.8808\times1.3924=2.6189} degrees squared.
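+
+If you want to do this multiplication on the command-line, a one-line AWK call (no Gnuastro program needed) is enough:
+
+@example
+$ echo 1.880835157 1.392461166 | awk '@{print $1*$2@}'
+2.61899
+@end example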
+
+However, the sky coverage reported above has two caveats:
+1) it does not take into account the blank pixels (NaN) that are on the four corners of the @file{deep.fits} image;
+2) the differing area of pixels on the spherical sky, in relation to those blank values, can result in wrong estimations of the area.
+So far, these blank areas are very small (and do not constitute a large
portion of the image).
+As a result, these effects are negligible.
+However, as you get more creative with the dither pattern to optimize your
scientific goal, such blank areas will cover a larger fraction of your final
stack.
+
+So let's get a very accurate estimation of the area that will not be affected by the issues above.
+With the first command below, we'll use the @option{--pixelareaonwcs} option of the Fits program, which will return the area of each pixel (in units of degrees squared).
+After running the second command, please have a look at the output of the first:
@example
-$ automake --version
-@end example
+$ astfits deep.fits --pixelareaonwcs --output=deep-pix-area.fits
-@item GNU Autoconf (@command{autoconf})
-@cindex GNU Autoconf
-GNU Autoconf will build the @file{configure} script using the configurations
we have defined (hand-written) in @file{configure.ac}.
+$ astscript-fits-view deep-pix-area.fits
+@end example
-To check that you have a working GNU Autoconf in your system, you can try this
command:
+@cindex Gnomonic projection
+The gradient you see in this image (that gets slightly curved towards the top
of the image) is the effect of the default
@url{https://en.wikipedia.org/wiki/Gnomonic_projection, Gnomonic projection}
(summarized as @code{TAN} in the FITS WCS standard).
+Since this image is aligned with the celestial coordinates, as we increase the
declination, the pixel area also increases.
+For a comparison, please run the Fits program with the @option{--pixelareaonwcs} option on the originally downloaded @file{jplus-1050345.fits.fz} to see the distortion pattern of the camera's optics, which dominates there.
+We can now use Arithmetic to set the areas of all the pixels that were NaN in @file{deep.fits} to NaN, then sum all the remaining pixel values to get an accurate estimate of the area we get from this dither pattern:
@example
-$ autoconf --version
+$ astarithmetic deep-pix-area.fits deep.fits isblank nan where -g1 \
+ sumvalue --quiet
+2.57318473063151e+00
@end example
-@item GNU Autoconf Archive
-@cindex GNU Autoconf Archive
-These are a large collection of tests that can be called to run at
@command{./configure} time.
-See the explanation under GNU Portability Library (Gnulib) above for
instructions on obtaining it and keeping it up to date.
+As expected, this is very close to the simple multiplication that we did above.
+But it will allow us to accurately estimate the size of the deep field with any dither pattern.
-GNU Autoconf Archive is a source-based dependency of Gnuastro's bootstrapping
process, so simply having it is enough on your computer, there is no need to
install, and thus check anything.
-Just do not forget that it has to be in the same directory as Gnulib
(described above).
+As mentioned above, M94 is about half a degree in diameter; so let's set
@code{step_arcmin=15}.
+This is one quarter of a degree and will put the centers of the four outer exposures on the four corners of M94's main ring.
+You are now ready to repeat the commands above with this changed value.
-@item GNU Texinfo (@command{texinfo})
-@cindex GNU Texinfo
-GNU Texinfo is the tool that formats this manual into the various output
formats.
-To bootstrap Gnuastro you need all of Texinfo's command-line programs.
-However, some operating systems package them separately, for example, in
Fedora, @command{makeinfo} is packaged in the @command{texinfo-tex} package.
+@cartouche
+@noindent
+@strong{Writing scripts:}
+It is better to write the steps above as a script so you can easily change the
basic settings and see the output fast.
+For more on writing scripts, see @ref{Writing scripts to automate the steps}.
+@end cartouche
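+
+As an example, the steps so far can be condensed into a minimal sketch like the one below (it assumes the same inputs and file names used above; edit the two parameters at the top and re-run to experiment):
+
+@example
+#!/bin/bash
+# Parameters to experiment with.
+step_arcmin=15
+deep_thresh=5
+
+# Center of the final stack (center of M94).
+center_ra=192.721250
+center_dec=41.120556
+
+# Build the dither table from scratch on every run.
+echo "# Column 1: RA [deg, f64] Right Ascension"  > dither.txt
+echo "# Column 2: Dec [deg, f64] Declination"    >> dither.txt
+echo $center_ra $center_dec \
+     | awk '@{s='$step_arcmin'/60; fmt="%-10.6f %-10.6f\n"; \
+             printf fmt, $1, $2; \
+             printf fmt, $1+s, $2; \
+             printf fmt, $1, $2+s; \
+             printf fmt, $1-s, $2; \
+             printf fmt, $1, $2-s@}' \
+     >> dither.txt
+
+# Simulate the stack and extract the deep region.
+astscript-dither-simulate dither.txt --output=stack.fits \
+                 --img=input/ref.fits --width=2 \
+                 --center=$center_ra,$center_dec
+astarithmetic stack.fits set-s s s $deep_thresh lt nan where \
+              trim --output=deep.fits
+
+# Print the area (in degrees squared) of the deep region.
+astfits deep.fits --pixelareaonwcs --output=deep-pix-area.fits
+astarithmetic deep-pix-area.fits deep.fits isblank nan where \
+              -g1 sumvalue --quiet
+@end example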
-To check that you have a working GNU Texinfo in your system, you can try this
command:
+After you run the commands above with this single change, you will get a total
area of 2.1597 degrees squared.
+This is just roughly @mymath{15\%} smaller than the previous area; but it is much easier to calibrate.
+However, since each pointing's center will fall on one edge of the galaxy, M94
will be present in all the exposures while doing the calibrations.
+We already see a large ring around this galaxy, and when we do a low surface
brightness optimized reduction, there is a chance that the size of the galaxy
is much larger.
-@example
-$ makeinfo --version
-@end example
+Ideally, you want your target to be on the four edges/corners of each image.
+This will make sure that a large fraction of each exposure will not be covered by your final target, allowing you to calibrate much more accurately.
+Let's try setting @code{step_arcmin=40} (almost half the width of the
detector).
+You will notice that the area is now 0.05013 degrees squared!
+This is 51 times smaller!
-@item GNU Libtool (@command{libtool})
-@cindex GNU Libtool
-GNU Libtool is in charge of building all the libraries in Gnuastro.
-The libraries contain functions that are used by more than one program and are
installed for use in other programs.
-They are thus put in a separate directory (@file{lib/}).
+Take a look at @file{deep.fits}, and you will see that it is a horizontally elongated rectangle!
+To see the cause, have a look at @file{stack.fits}: the part with 5 exposures is now very small; it is surrounded by a cross-like pattern (which is thicker along the horizontal) that is four exposures deep, and an even larger square-like region which is three exposures deep.
-To check that you have a working GNU Libtool in your system, you can try this
command (and from the output, make sure it is GNU's libtool)
+The difference between 3 exposures and 5 exposures seems like a lot at first.
+But let's calculate how much it actually affects the achieved signal-to-noise ratio and the surface brightness limit (for more, see @ref{Quantifying measurement limits}).
+The surface brightness limit and the upper-limit surface brightness are both calculated by applying the definition of magnitude to the standard deviation of the background.
+So let's calculate how much this difference in depth affects the sky standard deviation.
-@example
-$ libtool --version
-@end example
+Deep images will usually be dominated by @ref{Photon counting noise} (or
Poisson noise).
+Therefore, if a single exposure image has a sky standard deviation of
@mymath{\sigma_s}, and we combine @mymath{N} such exposures, the sky standard
deviation on the stack will be @mymath{\sigma_s/\sqrt{N}}.
+As a result, the surface brightness limit between the regions with @mymath{N} exposures and @mymath{M} exposures differs by @mymath{2.5\times\log_{10}(\sqrt{N/M}) = 1.25\times\log_{10}(N/M)} magnitudes.
+If we set @mymath{N=3} and @mymath{M=5}, we get a surface brightness magnitude difference of 0.27!
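+
+To check this number yourself, a one-line AWK call is enough (AWK's @code{log} is the natural logarithm, so we divide by @code{log(10)} to get the base-10 logarithm):
+
+@example
+$ echo 3 5 | awk '@{print 1.25*log($2/$1)/log(10)@}'
+0.277311
+@end example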
-@item GNU help2man (@command{help2man})
-@cindex GNU help2man
-GNU help2man is used to convert the output of the @option{--help} option
-(@ref{--help}) to the traditional Man page (@ref{Man pages}).
+This is a very small difference (given all the other sources of error that
will be present).
+Let's see how much we increase our stack area if we set @code{deep_thresh=3}.
+The newly calculated area is 2.6706 degrees squared!
+This is just slightly larger than the first trial (with @code{step_arcmin=1})!
+Therefore, at the cost of decreasing our surface brightness limit by 0.27 magnitudes, we are now able to perfectly calibrate the individual exposures, and even cover a larger area!
-To check that you have a working GNU Help2man in your system, you can try this
command:
+@cartouche
+@noindent
+@strong{Calibration is very important:} Better calibration can result in a fainter surface brightness limit than more exposures with poor calibration; especially for very low surface brightness signal that covers a large area and is systematically affected by calibration issues.
+@end cartouche
+
+Based on the argument above, let's define our deep region to be the pixels with 3 or more exposures.
+Now, let's have a look at the horizontally stretched cross that we see for the regions with 4 exposures.
+The reason that the vertical component is thicker is that the same change in RA and Dec (defined on a curved sphere) corresponds to different numbers of pixels on this flat image pixel grid.
+
+To have the same step size in both directions on the sky, we should divide the RA step by the cosine of the declination.
+The command below shows the relevant changes to the dither table construction above.
@example
-$ help2man --version
+$ echo $center_ra $center_dec \
+ | awk '@{s='$step_arcmin'/60; fmt="%-10.6f %-10.6f\n"; \
+ pi=atan2(0, -1); r=pi/180; \
+ printf fmt, $1, $2; \
+ printf fmt, $1+(s/cos($2*r)), $2; \
+ printf fmt, $1, $2+s; \
+ printf fmt, $1-(s/cos($2*r)), $2; \
+ printf fmt, $1, $2-s@}' \
+ >> dither.txt
@end example
+@noindent
+Here are two important points to consider when comparing the previous AWK
command with this one:
+@itemize
+@item
+The cosine function of AWK (@code{cos}) assumes that the input is in radians,
not degrees.
+We therefore have to multiply each declination (in degrees) by a variable
@code{r} that contains the conversion factor (@mymath{\pi/180}).
+@item
+AWK doesn't have the value of @mymath{\pi} in memory.
+We need to calculate it, and to do that, we use the @code{atan2} function (as
recommended in the AWK manual, for its definition in Gnuastro, see
@ref{Trigonometric and hyperbolic operators}).
+@end itemize
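+
+You can quickly check both points above in one line (the last digits may differ slightly between AWK implementations):
+
+@example
+$ awk 'BEGIN @{pi=atan2(0,-1); r=pi/180; \
+       printf "%.5f %.5f\n", pi, cos(41.120556*r)@}'
+3.14159 0.75333
+@end example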
-@item @LaTeX{} and some @TeX{} packages
-@cindex @LaTeX{}
-@cindex @TeX{} Live
-Some of the figures in this book are built by @LaTeX{} (using the PGF/TikZ
package).
-The @LaTeX{} source for those figures is version controlled for easy
maintenance not the actual figures.
-So the @file{./boostrap} script will run @LaTeX{} to build the figures.
-The best way to install @LaTeX{} and all the necessary packages is through
@url{https://www.tug.org/texlive/, @TeX{} live} which is a package manager for
@TeX{} related tools that is independent of any operating system.
-It is thus preferred to the @TeX{} Live versions distributed by your operating
system.
-
-To install @TeX{} Live, go to the web page and download the appropriate
installer by following the ``download'' link.
-Note that by default the full package repository will be downloaded and
installed (around 4 Gigabytes) which can take @emph{very} long to download and
to update later.
-However, most packages are not needed by everyone, it is easier, faster and
better to install only the ``Basic scheme'' (consisting of only the most basic
@TeX{} and @LaTeX{} packages, which is less than 200 Mega bytes)@footnote{You
can also download the DVD iso file at a later time to keep as a backup for when
you do not have internet connection if you need a package.}.
+Please put the new AWK command above into your script for the steps above and run it with everything else unchanged.
+Afterwards, open @file{deep.fits}.
+You will see that the widths of both the horizontal and vertical regions are the same.
-After the installation, be sure to set the environment variables as suggested
in the end of the outputs.
-Any time you confront (need) a package you do not have, simply install it with
a command like below (similar to how you install software from your operating
system's package manager)@footnote{After running @TeX{}, or @LaTeX{}, you might
get a warning complaining about a @file{missingfile}.
-Run `@command{tlmgr info missingfile}' to see the package(s) containing that
file which you can install.}.
-To install all the necessary @TeX{} packages for a successful Gnuastro
bootstrap, run this command:
+@cartouche
+@noindent
+@strong{RA and Dec should be treated differently:} As shown above, when considering differences between two points in your dither pattern, it is important to remember that an interval in RA only corresponds to its true angular size on the equator of the celestial sphere.
+So when you shift by an angular distance of @mymath{+\delta} degrees parallel to the equator, from a point located at an RA and Dec of [@mymath{r}, @mymath{d}], the RA and Dec of the new point are [@mymath{r+\delta/\cos(d), d}].
+@end cartouche
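+
+For example, moving an angular distance of 40 arc-minutes eastward from the center of M94, parallel to the equator (using the same AWK constructs as above; this is just a demonstration, not part of the dither table):
+
+@example
+$ echo 192.721250 41.120556 \
+       | awk '@{pi=atan2(0,-1); r=pi/180; d=40/60; \
+               printf "%.4f %.4f\n", $1+d/cos($2*r), $2@}'
+193.6062 41.1206
+@end example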
-@example
-$ sudo su
-# tlmgr install epsf jknapltx caption biblatex biber iftex \
- etoolbox logreq xstring xkeyval pgf ms \
- xcolor pgfplots times rsfs ps2eps epspdf
-@end example
+You can try making the cross-like region as thin as possible by slightly
increasing the step size.
+For example, set it to @code{step_arcmin=42}.
+When you open @file{deep.fits}, you will see that the depth across this image
is almost contiguous (which is another positive factor!).
-To check that you have a working @LaTeX{} executable in your system, you can
try this command (this just checks if @LaTeX{} exists, as described above, if
you have a missing package, you can easily identify it from the output and
install it with @command{tlmgr}):
+You can construct any complex dither pattern (with more than 5 points) based
on the logic and reasoning above to help extract the most science from the
valuable telescope time that you will be getting.
+Of course, factors like the optimal exposure time are also critical, but they are beyond the scope of this tutorial.
-@example
-$ latex --version
-@end example
-@item ImageMagick (@command{imagemagick})
-@cindex ImageMagick
-ImageMagick is a wonderful and robust program for image manipulation on the
command-line.
-@file{bootstrap} uses it to convert the book images into the formats necessary
for the various book formats.
-Since ImageMagick version 7, it is necessary to edit the policy file
(@file{/etc/ImageMagick-7/policy.xml}) to have the following line (it maybe
present, but commented, in this case un-comment it):
-@example
-<policy domain="coder" rights="read|write" pattern="@{PS,PDF,XPS@}"/>
-@end example
+@node Installation, Common program behavior, Tutorials, Top
+@chapter Installation
-If the following line is present, it is also necessary to comment/remove it.
+@c This link is put here because the `Quick start' section of the first
+@c chapter is not the most eye-catching part of the manual and some users
+@c were seen to follow this ``Installation'' chapter title in search of the
+@c tarball and fast instructions.
+@cindex Installation
+The latest released version of Gnuastro source code is always available at the
following URL:
-@example
-<policy domain="delegate" rights="none" pattern="gs" />
-@end example
+@url{http://ftpmirror.gnu.org/gnuastro/gnuastro-latest.tar.gz}
-To learn more about the ImageMagick security policy please see:
@url{https://imagemagick.org/script/security-policy.php}.
+@noindent
+@ref{Quick start} describes the commands necessary to configure, build, and
install Gnuastro on your system.
+This chapter will be useful in cases where the simple procedure above is not sufficient; for example, when your system lacks a mandatory/optional dependency (in other words, you cannot pass the @command{$ ./configure} step), when you want greater customization, when you want to build and install Gnuastro from arbitrary points in its history, or when you want a higher level of control over the installation.
+Thus if you were happy with downloading the tarball and following @ref{Quick
start}, then you can safely ignore this chapter and come back to it in the
future if you need more customization.
-To check that you have a working ImageMagick in your system, you can try this
command:
+@ref{Dependencies} describes the mandatory, optional and bootstrapping
dependencies of Gnuastro.
+Only the first group is required/mandatory when building Gnuastro from a tarball (see @ref{Release tarball}).
+They are very basic and low-level tools used in most astronomical software, so you might already have them installed; if not, they are very easy to install as described for each.
+@ref{Downloading the source} discusses the two methods for obtaining the source code: as a tarball (a significant snapshot in Gnuastro's history), or the full history@footnote{@ref{Bootstrapping dependencies} are required if you clone the full history.}.
+The latter allows you to build Gnuastro at any random point in its history
(for example, to get bug fixes or new features that are not released as a
tarball yet).
-@example
-$ convert --version
-@end example
+The building and installation of Gnuastro is heavily customizable; to learn more, see @ref{Build and install}.
+This section is essentially a thorough explanation of the steps in @ref{Quick
start}.
+It discusses ways you can influence the building and installation.
+If you encounter any problems in the installation process, it is probably
already explained in @ref{Known issues}.
+@ref{Other useful software} discusses the installation and usage of some other free software that is not directly required by Gnuastro, but that may be useful in conjunction with it.
-@end table
+@menu
+* Dependencies:: Necessary packages for Gnuastro.
+* Downloading the source:: Ways to download the source code.
+* Build and install:: Configure, build and install Gnuastro.
+@end menu
-@node Dependencies from package managers, , Bootstrapping dependencies,
Dependencies
-@subsection Dependencies from package managers
-@cindex Package managers
-@cindex Source code building
-@cindex Building from source
-@cindex Compiling from source
-@cindex Source code compilation
-@cindex Distributions, GNU/Linux
-The most basic way to install a package on your system is to build the
packages from source yourself.
-Alternatively, you can use your operating system's package manager to download
pre-compiled files and install them.
-The latter choice is easier and faster.
-However, we recommend that you build the @ref{Mandatory dependencies} yourself
from source (all necessary commands and links are given in the respective
section).
-Here are some basic reasons behind this recommendation.
+@node Dependencies, Downloading the source, Installation, Installation
+@section Dependencies
-@enumerate
+A minimal set of dependencies is mandatory for building Gnuastro from the standard tarball release.
+If they are not present, you cannot pass Gnuastro's configuration step.
+The mandatory dependencies are therefore very basic (low-level) tools which are easy to obtain, build and install; see @ref{Mandatory dependencies} for a full discussion.
-@item
-Your operating system's pre-built software might not be the most recent
release.
-For example, Gnuastro itself is also packaged in some package managers.
-For the list see: @url{https://repology.org/project/gnuastro/versions}.
-You will notice that Gnuastro's version in some operating systems is more than
10 versions old!
-It is the same for all the dependencies of Gnuastro.
+If you have the packages of @ref{Optional dependencies}, Gnuastro will have
additional functionality (for example, converting FITS images to JPEG or PDF).
+If you are installing from a tarball as explained in @ref{Quick start}, you
can stop reading after this section.
+If you are cloning the version controlled source (see @ref{Version controlled source}), an additional bootstrapping step is required before configuration; its dependencies are explained in @ref{Bootstrapping dependencies}.
-@item
-For each package, Gnuastro might preform better (or require) certain
configuration options that your distribution's package managers did not add for
you.
-If present, these configuration options are explained during the installation
of each in the sections below (for example, in @ref{CFITSIO}).
-When the proper configuration has not been set, the programs should complain
and inform you.
+Your operating system's package manager is an easy and convenient way to
download and install the dependencies that are already pre-built for your
operating system.
+In @ref{Dependencies from package managers}, we will list some common
operating system package manager commands to install the optional and mandatory
dependencies.
-@item
-For the libraries, they might separate the binary file from the header files
which can cause confusion, see @ref{Known issues}.
+@menu
+* Mandatory dependencies:: Gnuastro will not install without these.
+* Optional dependencies:: Adding more functionality.
+* Bootstrapping dependencies:: If you have the version controlled source.
+* Dependencies from package managers:: Installing from OS package managers.
+@end menu
-@item
-Like any other tool, the science you derive from Gnuastro's tools highly
depend on these lower level dependencies, so generally it is much better to
have a close connection with them.
-By reading their manuals, installing them and staying up to date with
changes/bugs in them, your scientific results and understanding (of what is
going on, and thus how you interpret your scientific results) will also
correspondingly improve.
-@end enumerate
+@node Mandatory dependencies, Optional dependencies, Dependencies, Dependencies
+@subsection Mandatory dependencies
-Based on your package manager, you can use any of the following commands to
install the mandatory and optional dependencies.
-If your package manager is not included in the list below, please send us the
respective command, so we add it.
-For better archivability and compression ratios, Gnuastro's recommended
tarball compression format is with the @url{http://lzip.nongnu.org/lzip.html,
Lzip} program, see @ref{Release tarball}.
-Therefore, the package manager commands below also contain Lzip.
+@cindex Dependencies, Gnuastro
+@cindex GNU build system
+The mandatory Gnuastro dependencies are very basic and low-level tools.
+They all follow the same basic GNU-based build system (like that shown in @ref{Quick start}), so even if you do not have them, installing them should be pretty straightforward.
+In this section, we explain each program and any specific notes that might be necessary for its installation.
-@table @asis
-@item @command{apt-get} (Debian-based OSs: Debian, Ubuntu, Linux Mint, etc.)
-@cindex Debian
-@cindex Ubuntu
-@cindex Linux Mint
-@cindex @command{apt-get}
-@cindex Advanced Packaging Tool (APT, Debian)
-@url{https://en.wikipedia.org/wiki/Debian,Debian} is one of the oldest
-GNU/Linux
-distributions@footnote{@url{https://en.wikipedia.org/wiki/List_of_Linux_distributions#Debian-based}}.
-It thus has a very extended user community and a robust internal structure and
standards.
-All of it is free software and based on the work of volunteers around the
world.
-Many distributions are thus derived from it, for example, Ubuntu and Linux
Mint.
-This arguably makes Debian-based OSs the largest, and most used, class of
GNU/Linux distributions.
-All of them use Debian's Advanced Packaging Tool (APT, for example,
@command{apt-get}) for managing packages.
-@table @asis
-@item Mandatory dependencies
-Without these, Gnuastro cannot be built, they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
-@example
-$ sudo apt-get install libgsl-dev libcfitsio-dev \
- wcslib-dev
-@end example
+@menu
+* GNU Scientific Library:: Installing GSL.
+* CFITSIO:: C interface to the FITS standard.
+* WCSLIB:: C interface to the WCS standard of FITS.
+@end menu
-@item Optional dependencies
-If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
-@example
-$ sudo apt-get install ghostscript libtool-bin \
- libjpeg-dev libtiff-dev
- libgit2-dev curl lzip
-@end example
+@node GNU Scientific Library, CFITSIO, Mandatory dependencies, Mandatory
dependencies
+@subsubsection GNU Scientific Library
+
+@cindex GNU Scientific Library
+The @url{http://www.gnu.org/software/gsl/, GNU Scientific Library}, or GSL, is
a large collection of functions that are very useful in scientific
applications, for example, integration, random number generation, and Fast
Fourier Transform among many others.
+To download and install GSL from source, you can run the following commands.
-@item Programs to view FITS images or tables
-These are not used in Gnuastro's build.
-They can just help in viewing the inputs/outputs independent of Gnuastro!
@example
-$ sudo apt-get install saods9 topcat
+$ wget http://ftpmirror.gnu.org/gsl/gsl-latest.tar.gz
+$ tar xf gsl-latest.tar.gz
+$ cd gsl-X.X # Replace X.X with version number.
+$ ./configure CFLAGS="$CFLAGS -g0 -O3"
+$ make -j8 # Replace 8 with no. CPU threads.
+$ make check
+$ sudo make install
@end example
-@end table
-
-@noindent
-Gnuastro is @url{https://tracker.debian.org/pkg/gnuastro,packaged} in Debian
(and thus some of its derivate operating systems).
-Just make sure it is the most recent version.
-@item @command{dnf}
-@itemx @command{yum} (Red Hat-based OSs: Red Hat, Fedora, CentOS, Scientific
Linux, etc.)
-@cindex RHEL
-@cindex Fedora
-@cindex CentOS
-@cindex Red Hat
-@cindex @command{dnf}
-@cindex @command{yum}
-@cindex Scientific Linux
-@url{https://en.wikipedia.org/wiki/Red_Hat,Red Hat Enterprise Linux} (RHEL) is
released by Red Hat Inc.
-RHEL requires paid subscriptions for use of its binaries and support.
-But since it is free software, many other teams use its code to spin-off their
own distributions based on RHEL.
-Red Hat-based GNU/Linux distributions initially used the ``Yellowdog Updated,
Modifier'' (YUM) package manager, which has been replaced by ``Dandified yum''
(DNF).
-If the latter is not available on your system, you can use @command{yum}
instead of @command{dnf} in the command below.
+@node CFITSIO, WCSLIB, GNU Scientific Library, Mandatory dependencies
+@subsubsection CFITSIO
-@table @asis
-@item Mandatory dependencies
-Without these, Gnuastro cannot be built, they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
-@example
-$ sudo dnf install gsl-devel cfitsio-devel \
- wcslib-devel
-@end example
+@cindex CFITSIO
+@cindex FITS standard
+@url{http://heasarc.gsfc.nasa.gov/fitsio/, CFITSIO} is the closest you can get
to the pixels in a FITS image while remaining faithful to the
@url{http://fits.gsfc.nasa.gov/fits_standard.html, FITS standard}.
+It is written by William Pence, the principal author of the FITS
standard@footnote{Pence, W.D. et al. Definition of the Flexible Image Transport
System (FITS), version 3.0. (2010) Astronomy and Astrophysics, Volume 524,
id.A42, 40 pp.}, and is regularly updated.
+It sets the definitions for all other software packages that use FITS images.
-@item Optional dependencies
-If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
-@example
-$ sudo dnf install ghostscript libtool \
- libjpeg-devel libtiff-devel \
- libgit2-devel lzip curl
-@end example
+@vindex --enable-reentrant
+@cindex Reentrancy, multiple file opening
+@cindex Multiple file opening, reentrancy
+Some GNU/Linux distributions have CFITSIO in their package managers; if it is available and updated, you can use it.
+One problem that might occur is that the distribution's CFITSIO might not be configured with the @option{--enable-reentrant} option.
+This option allows CFITSIO to open a file in multiple threads; it can thus provide great speed improvements.
+If CFITSIO was not configured with this option, any program which needs this
capability will warn you and abort when you ask for multiple threads (see
@ref{Multi-threaded operations}).
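+
+For example, if your CFITSIO provides the @code{fits_is_reentrant} function (available in recent versions), a small sketch like the one below (the file name is hypothetical) prints @code{1} when reentrancy is enabled:
+
+@example
+$ cat > check-reentrant.c <<EOF
+#include <stdio.h>
+#include <fitsio.h>
+int main(void)
+@{ printf("%d\n", fits_is_reentrant()); return 0; @}
+EOF
+$ gcc check-reentrant.c -lcfitsio -o check-reentrant
+$ ./check-reentrant
+@end example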
-@item Programs to view FITS images or tables
-These are not used in Gnuastro's build.
-They can just help in viewing the inputs/outputs independent of Gnuastro!
-@example
-$ sudo dnf install saods9 topcat
-@end example
-@end table
+To install CFITSIO from source, we strongly recommend that you have a look through Chapter 2 (Creating the CFITSIO library) of the CFITSIO manual and understand the options you can pass to @command{$ ./configure} (there are not too many).
+This is a very basic package for most astronomical software, so it is best to configure it carefully for your system.
+Once you download the source and unpack it, the following configure script should be enough for most purposes.
+Do not forget to read Chapter 2 of the manual though; for example, the second option below is only for 64-bit systems.
+The manual also explains how to check if it has been installed correctly.
-@item @command{brew} (macOS)
-@cindex macOS
-@cindex Homebrew
-@cindex MacPorts
-@cindex @command{brew}
-@url{https://en.wikipedia.org/wiki/MacOS,macOS} is the operating system used
on Apple devices.
-macOS does not come with a package manager pre-installed, but several widely
used, third-party package managers exist, such as Homebrew or MacPorts.
-Both are free software.
-Currently we have only tested Gnuastro's installation with Homebrew as
described below.
-If not already installed, first obtain Homebrew by following the instructions
at @url{https://brew.sh}.
+CFITSIO comes with two executable files called @command{fpack} and
@command{funpack}.
+From their manual: they ``are standalone programs for compressing and
uncompressing images and tables that are stored in the FITS (Flexible Image
Transport System) data format.
+They are analogous to the gzip and gunzip compression programs except that
they are optimized for the types of astronomical images that are often stored
in FITS format''.
+The commands below will compile and install them on your system along with
CFITSIO.
+They are not essential for Gnuastro, since they are just wrappers for
functions within CFITSIO, but they can come in handy.
+The @command{make utils} command is only available for versions above 3.39; it will build these executable files along with several other executable test files, which are deleted in the commands below before the installation (otherwise the test files would also be installed).
-@table @asis
-@item Mandatory dependencies
-Without these, Gnuastro cannot be built, they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
+The commands necessary to download, decompress, build and install CFITSIO from source are described below.
-Homebrew manages packages in different `taps'.
-To install WCSLIB via Homebrew you will need to @command{tap} into
@command{brewsci/science} first (the tap may change in the future, but can be
found by calling @command{brew search wcslib}).
@example
-$ brew tap brewsci/science
-$ brew install wcslib gsl cfitsio
+$ urlbase=http://heasarc.gsfc.nasa.gov/FTP/software/fitsio/c
+$ wget $urlbase/cfitsio_latest.tar.gz
+$ tar xf cfitsio_latest.tar.gz
+$ cd cfitsio-X.XX # Replace X.XX with version
+$ ./configure --prefix=/usr/local --enable-sse2 --enable-reentrant \
+ CFLAGS="$CFLAGS -g0 -O3"
+$ make
+$ make utils
+$ ./testprog > testprog.lis # See below if this has an error
+$ diff testprog.lis testprog.out # Should have no output
+$ cmp testprog.fit testprog.std # Should have no output
+$ rm cookbook fitscopy imcopy smem speed testprog
+$ sudo make install
@end example
-@item Optional dependencies
-If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
-@example
-$ brew install ghostscript libtool libjpeg \
- libtiff libgit2 curl lzip
-@end example
+In the @code{./testprog > testprog.lis} step, you may encounter an error complaining that it cannot find @file{libcfitsio.so.AAA} (where @code{AAA} is an integer).
+This is the library that you just built but have not yet installed.
+Unfortunately, some versions of CFITSIO do not account for this on some OSs.
+To fix the problem, you need to tell your OS to also look into the current CFITSIO build directory with the first command below; afterwards, the problematic command (the second one below) should run properly.
-@item Programs to view FITS images or tables
-These are not used in Gnuastro's build.
-They can just help in viewing the inputs/outputs independent of Gnuastro!
@example
-$ brew install saoimageds9 topcat
+$ export LD_LIBRARY_PATH="$(pwd):$LD_LIBRARY_PATH"
+$ ./testprog > testprog.lis
@end example
-@end table
-@item @command{pacman} (Arch Linux)
-@cindex Arch GNU/Linux
-@cindex @command{pacman}
-@url{https://en.wikipedia.org/wiki/Arch_Linux,Arch Linux} is a smaller
GNU/Linux distribution, which follows the KISS principle (``keep it simple,
stupid'') as a general guideline.
-It ``focuses on elegance, code correctness, minimalism and simplicity, and
expects the user to be willing to make some effort to understand the system's
operation''.
-Arch GNU/Linux uses ``Package manager'' (Pacman) to manage its
packages/components.
+Recall that the modification above is ONLY NECESSARY FOR THIS STEP.
+@emph{Do not} put the @code{LD_LIBRARY_PATH} modification command in a permanent place (like your Bash startup file).
+After installing CFITSIO, close your terminal and continue working on a new
terminal (so @code{LD_LIBRARY_PATH} has its default value).
+For more on @code{LD_LIBRARY_PATH}, see @ref{Installation directory}.
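+
+Afterwards (in a new terminal), you can optionally confirm the version of the installed CFITSIO; for example, if your CFITSIO version installed a pkg-config file, with this command:
+
+@example
+$ pkg-config --modversion cfitsio
+@end example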
-@table @asis
-@item Mandatory dependencies
-Without these, Gnuastro cannot be built, they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
-@example
-$ sudo pacman -S gsl cfitsio wcslib
-@end example
-@item Optional dependencies
-If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
-@example
-$ sudo pacman -S ghostscript libtool libjpeg \
- libtiff libgit2 curl lzip
-@end example
-@item Programs to view FITS images or tables
-These are not used in Gnuastro's build.
-They can just help in viewing the inputs/outputs independent of Gnuastro!
-SAO DS9 and TOPCAT are not available in the standard Arch GNU/Linux
repositories.
-However, installing and using both is very easy from their own web pages, as
described in @ref{SAO DS9} and @ref{TOPCAT}.
-@end table
-@item @command{zypper} (openSUSE and SUSE Linux Enterprise Server)
-@cindex openSUSE
-@cindex SUSE Linux Enterprise Server
-@cindex @command{zypper}, OpenSUSE package manager
-SUSE Linux Enterprise Server@footnote{@url{https://www.suse.com/products/server}} (SLES) is the commercial offering which shares code and tools with the community-developed openSUSE.
-Many additional packages are offered in the Build
Service@footnote{@url{https://build.opensuse.org}}.
-openSUSE and SLES use @command{zypper} (cli) and YaST (GUI) for managing
repositories and packages.
+@node WCSLIB, , CFITSIO, Mandatory dependencies
+@subsubsection WCSLIB
-@table @asis
-@item Configuration
-When building Gnuastro, run the configure script with the following
@code{CPPFLAGS} environment variable:
+@cindex WCS
+@cindex WCSLIB
+@cindex World Coordinate System
+@url{http://www.atnf.csiro.au/people/mcalabre/WCS/, WCSLIB} is written and
maintained by one of the authors of the World Coordinate System (WCS)
definition in the @url{http://fits.gsfc.nasa.gov/fits_standard.html, FITS
standard}@footnote{Greisen E.W., Calabretta M.R. (2002) Representation of world
coordinates in FITS.
+Astronomy and Astrophysics, 395, 1061-1075.}, Mark Calabretta.
+It might already be built and ready in your distribution's package management system.
+However, the installation from source is explained here; for the advantages of installing from source, please see @ref{Mandatory dependencies}.
+To install WCSLIB you will need to have CFITSIO already installed, see
@ref{CFITSIO}.
-@example
-$ ./configure CPPFLAGS="-I/usr/include/cfitsio"
-@end example
+@vindex --without-pgplot
+WCSLIB also has plotting capabilities which use PGPLOT (a plotting library for
C).
+If you want to use those capabilities in WCSLIB, @ref{PGPLOT} provides the PGPLOT installation instructions.
+However, PGPLOT is old@footnote{As of early June 2016, its most recent version was uploaded in February 2001.}, so its installation is not easy; there are also many great modern WCS plotting tools (mostly written in Python).
+Hence, if you will not be using those plotting functions in WCSLIB, you can
configure it with the @option{--without-pgplot} option as shown below.
-@item Mandatory dependencies
-Without these, Gnuastro cannot be built, they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
-@example
-$ sudo zypper install gsl-devel cfitsio-devel \
- wcslib-devel
-@end example
+If you have the cURL library@footnote{@url{https://curl.haxx.se}} on your system and you installed CFITSIO version 3.42 or later, you will also need to link with the cURL library at configure time (through the @code{-lcurl} option as shown below).
+CFITSIO uses the cURL library for its HTTPS (or HTTP
Secure@footnote{@url{https://en.wikipedia.org/wiki/HTTPS}}) support and if it
is present on your system, CFITSIO will depend on it.
+Therefore, if the @command{./configure} command below fails (meaning you do not have the cURL library), remove this option and re-run the command.
-@item Optional dependencies
-If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
+To download, configure, build, check and install WCSLIB from source, you can
follow the steps below.
@example
-$ sudo zypper install ghostscript_any libtool \
- pkgconfig libcurl-devel \
- libgit2-devel \
- libjpeg62-devel \
- libtiff-devel curl
-@end example
+## Download and unpack the source tarball
+$ wget ftp://ftp.atnf.csiro.au/pub/software/wcslib/wcslib.tar.bz2
+$ tar xf wcslib.tar.bz2
-@item Programs to view FITS images or tables
-These are not used in Gnuastro's build.
-They can just help in viewing the inputs/outputs independent of Gnuastro!
-@example
-$ sudo zypper install ds9 topcat
-@end example
-@end table
-
-@c Gnuastro is @url{https://software.opensuse.org/package/gnuastro,packaged}
-@c in @command{zypper}. Just make sure it is the most recent version.
-@end table
-
-Usually, when libraries are installed by operating system package managers,
there should be no problems when configuring and building other programs from
source (that depend on the libraries: Gnuastro in this case).
-However, in some special conditions, problems may pop-up during the
configuration, building, or checking/running any of Gnuastro's programs.
-The most common of such problems and their solution are discussed below.
+## In the `cd' command, replace `X.X' with version number.
+$ cd wcslib-X.X
-@cartouche
-@noindent
-@strong{Not finding library during configuration:} If a library is installed,
but during Gnuastro's @command{configure} step the library is not found, then
configure Gnuastro like the command below (correcting @file{/path/to/lib}).
-For more, see @ref{Known issues} and @ref{Installation directory}.
-@example
-$ ./configure LDFLAGS="-L/path/to/lib"
+## If `./configure' fails, remove `-lcurl' and run again.
+$ ./configure LIBS="-pthread -lcurl -lm" --without-pgplot \
+ --disable-fortran CFLAGS="$CFLAGS -g0 -O3"
+$ make
+$ make check
+$ sudo make install
@end example
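+
+@noindent
+As with CFITSIO, if your WCSLIB version installs a pkg-config file, you can optionally confirm the installed version afterwards:
+
+@example
+$ pkg-config --modversion wcslib
+@end example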
-@end cartouche
-@cartouche
-@noindent
-@strong{Not finding header (.h) files while building:} If a library is
installed, but during Gnuastro's @command{make} step, the library's header
(file with a @file{.h} suffix) is not found, then configure Gnuastro like the
command below (correcting @file{/path/to/include}).
-For more, see @ref{Known issues} and @ref{Installation directory}.
-@example
-$ ./configure CPPFLAGS="-I/path/to/include"
-@end example
-@end cartouche
-@cartouche
-@noindent
-@strong{Gnuastro's programs do not run during check or after install:}
-If a library is installed, but the programs do not run due to linking
problems, set the @code{LD_LIBRARY_PATH} variable like below (assuming Gnuastro
is installed in @file{/path/to/installed}).
-For more, see @ref{Known issues} and @ref{Installation directory}.
-@example
-$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/path/to/installed/lib"
-@end example
-@end cartouche
+@node Optional dependencies, Bootstrapping dependencies, Mandatory
dependencies, Dependencies
+@subsection Optional dependencies
+The libraries listed here are only used for very specific applications; therefore, they are optional and Gnuastro can be built without them (with only those specific features disabled).
+Since these are pretty low-level tools, they are not too hard to install from
source, but you can also use your operating system's package manager to easily
install all of them.
+For more, see @ref{Dependencies from package managers}.
+@cindex GPL Ghostscript
+If the @command{./configure} script cannot find any of these optional
dependencies, it will notify you of the operation(s) you cannot do due to not
having them.
+If you continue the build and request an operation that uses a missing
library, Gnuastro's programs will warn that the optional library was missing at
build-time and abort.
+Since Gnuastro was built without that library, installing the library
afterwards will not help.
+The only way is to re-build Gnuastro from scratch (after the library has been
installed).
+However, for program dependencies (like cURL or Ghostscript) things are easier: you can also install them after building Gnuastro.
+This is because libraries are used to build the internal structure of
Gnuastro's executables.
+However, a program dependency is called by Gnuastro's programs at run-time and
has no effect on their internal structure.
+So if a dependency program becomes available later, it will be used next time
it is requested.
+@table @asis
+@item GNU Libtool
+@cindex GNU Libtool
+Libtool is a program that simplifies the management of the libraries needed to build an executable (a program).
+GNU Libtool has some added functionality compared to other implementations.
+If GNU Libtool is not present on your system at configuration time, a warning
will be printed and @ref{BuildProgram} will not be built or installed.
+The configure script will look into your search path (@code{PATH}) for GNU
Libtool through the following executable names: @command{libtool} (acceptable
only if it is the GNU implementation) or @command{glibtool}.
+See @ref{Installation directory} for more on @code{PATH}.
+GNU Libtool (the binary/executable file) is a low-level program that is
probably already present on your system, and if not, is available in your
operating system package manager@footnote{Note that we want the
binary/executable Libtool program which can be run on the command-line.
+In Debian-based operating systems which separate various parts of a package, you want @code{libtool-bin}; the @code{libtool} package will not contain the executable program.}.
+If you want to install GNU Libtool's latest version from source, please visit
its @url{https://www.gnu.org/software/libtool/, web page}.
+Gnuastro's tarball is shipped with an internal implementation of GNU Libtool.
+Even if you have GNU Libtool, Gnuastro's internal implementation is used for
the building and installation of Gnuastro.
+As a result, you can still build, install and use Gnuastro even if you do not
have GNU Libtool installed on your system.
+However, this internal Libtool does not get installed.
+Therefore, after Gnuastro's installation, if you want to use
@ref{BuildProgram} to compile and link your own C source code which uses the
@ref{Gnuastro library}, you need to have GNU Libtool available on your system
(independent of Gnuastro).
+See @ref{Review of library fundamentals} to learn more about libraries.
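+
+For example, after installing Gnuastro (with GNU Libtool available), a minimal sketch to confirm that @ref{BuildProgram} can compile, link and run a program using the @ref{Gnuastro library} may look like this (@file{myprogram.c} is a hypothetical file name):
+
+@example
+$ cat > myprogram.c <<EOF
+#include <stdio.h>
+#include <gnuastro/fits.h>
+int main(void)
+@{ printf("Gnuastro library is usable!\n"); return 0; @}
+EOF
+$ astbuildprog myprogram.c   # Compiles, links and runs it.
+@end example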
+@item GNU Make extension headers
+@cindex GNU Make
+GNU Make is a workflow management system that can be used to run a series of
commands in a specific order, and in parallel if you want.
+GNU Make offers special features to extend it with custom functions within a
dynamic library.
+They are defined in the @file{gnumake.h} header.
+If @file{gnumake.h} can be found on your system at configuration time,
Gnuastro will build a custom library that GNU Make can use for extended
functionality in (astronomical) data analysis scenarios.
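+
+To see if @file{gnumake.h} is visible to your compiler before configuring Gnuastro, an optional preprocessor test like the sketch below can help:
+
+@example
+$ echo '#include <gnumake.h>' | cc -x c -E - > /dev/null \
+       && echo "gnumake.h found"
+@end example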
-@node Downloading the source, Build and install, Dependencies, Installation
-@section Downloading the source
+@item libgit2
+@cindex Git
+@pindex libgit2
+@cindex Version control systems
+Git is one of the most common version control systems (see @ref{Version
controlled source}).
+When @file{libgit2} is present, and Gnuastro's programs are run within a
version controlled directory, outputs will contain the version number of the
working directory's repository for future reproducibility.
+See the @command{COMMIT} keyword header in @ref{Output FITS files} for a
discussion.
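+
+For example, after running a program within a version controlled directory, a sketch like this (with a hypothetical output file name) will show the stored commit:
+
+@example
+$ astfits out.fits -h1 | grep COMMIT
+@end example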
-Gnuastro's source code can be downloaded in two ways.
-As a tarball, ready to be configured and installed on your system (as
described in @ref{Quick start}), see @ref{Release tarball}.
-If you want official releases of stable versions this is the best, easiest and
most common option.
-Alternatively, you can clone the version controlled history of Gnuastro, run
one extra bootstrapping step and then follow the same steps as the tarball.
-This will give you access to all the most recent work that will be included in
the next release along with the full project history.
-The process is thoroughly introduced in @ref{Version controlled source}.
+@item libjpeg
+@pindex libjpeg
+@cindex JPEG format
+libjpeg is only used by ConvertType to read from and write to JPEG images, see
@ref{Recognized file formats}.
+@url{http://www.ijg.org/, libjpeg} is a very basic library that provides tools
to read and write JPEG images, most Unix-like graphic programs and libraries
use it.
+Therefore you most probably already have it installed.
+@url{http://libjpeg-turbo.virtualgl.org/, libjpeg-turbo} is an alternative to
libjpeg.
+It uses Single Instruction, Multiple Data (SIMD) instructions on ARM-based systems to significantly decrease the processing time of the JPEG compression and decompression algorithms.
+@item libtiff
+@pindex libtiff
+@cindex TIFF format
+libtiff is used by ConvertType and the libraries to read TIFF images, see
@ref{Recognized file formats}.
+@url{http://www.simplesystems.org/libtiff/, libtiff} is a very basic library
that provides tools to read and write TIFF images, most Unix-like operating
system graphic programs and libraries use it.
+Therefore, even if you do not have it installed, it should be easily available in your package manager.
+@item cURL
+@cindex cURL (downloading tool)
+cURL's executable (@command{curl}) is called by @ref{Query} for submitting
queries to remote datasets and retrieving the results.
+It is not necessary for the build of Gnuastro from source (only a warning will
be printed if it cannot be found at configure time), so if you do not have it
at build-time there is no problem.
+Just be sure to have it when you run @command{astquery}, otherwise you'll get
an error about not finding @command{curl}.
-@menu
-* Release tarball:: Download a stable official release.
-* Version controlled source:: Get and use the version controlled source.
-@end menu
+@item GPL Ghostscript
+@cindex GPL Ghostscript
+GPL Ghostscript's executable (@command{gs}) is called by ConvertType to
compile a PDF file from a source PostScript file, see @ref{ConvertType}.
+Therefore its headers (and libraries) are not needed.
-@node Release tarball, Version controlled source, Downloading the source,
Downloading the source
-@subsection Release tarball
+@item Python3 with Numpy
+@cindex Numpy
+@cindex Python3
+Python is a high-level programming language and Numpy is the most commonly
used library within Python to add multi-dimensional arrays and matrices.
+If you configure Gnuastro with @option{--with-python} @emph{and} version 3 of
Python is available with a corresponding Numpy Library, Gnuastro's library will
be built with some Python-related helper functions.
+Python wrappers for Gnuastro's library (for example, `pyGnuastro') can use
these functions when being built from source.
+For more on Gnuastro's Python helper functions, see @ref{Python interface}.
-A release tarball (commonly compressed) is the most common way of obtaining
free and open source software.
-A tarball is a snapshot of one particular moment in the Gnuastro development
history along with all the necessary files to configure, build, and install
Gnuastro easily (see @ref{Quick start}).
-It is very straightforward and needs the least set of dependencies (see
@ref{Mandatory dependencies}).
-Gnuastro has tarballs for official stable releases and pre-releases for
testing.
-See @ref{Version numbering} for more on the two types of releases and the
formats of the version numbers.
-The URLs for each type of release are given below.
+@cindex PyPI
+This Python interface is only relevant if you want to build the Python
wrappers (like `pyGnuastro') from source.
+If you install the Gnuastro Python wrapper from a pre-built repository like
PyPI, this feature of your Gnuastro library won't be used.
+Pre-built libraries contain the full Gnuastro library that they need within
them (you don't even need to have Gnuastro at all!).
-@table @asis
+@cartouche
+@noindent
+@strong{Can't find the Python3 and Numpy of a virtual environment:} make sure to set the @code{PYTHON} variable to point to the @code{python3} command of the virtual environment before running @code{./configure}.
+Note that you do not need to activate the virtual environment; just point @code{PYTHON} to its Python3 executable, as in the example below:
-@item Official stable releases (@url{http://ftp.gnu.org/gnu/gnuastro}):
-This URL hosts the official stable releases of Gnuastro.
-Always use the most recent version (see @ref{Version numbering}).
-By clicking on the ``Last modified'' title of the second column, the files
will be sorted by their date which you can also use to find the latest version.
-It is recommended to use a mirror to download these tarballs, please visit
@url{http://ftpmirror.gnu.org/gnuastro/} and see below.
+@example
+$ python3 -m venv test-env # Setting up the virtual env.
+$ export PYTHON="$(pwd)/test-env/bin/python3"
+$ ./configure # Gnuastro's configure script.
+@end example
+@end cartouche
-@item Pre-release tarballs (@url{http://alpha.gnu.org/gnu/gnuastro}):
-This URL contains unofficial pre-release versions of Gnuastro.
-The pre-release versions of Gnuastro here are for enthusiasts to try out
before an official release.
-If there are problems, or bugs then the testers will inform the developers to
fix before the next official release.
-See @ref{Version numbering} to understand how the version numbers here are
formatted.
-If you want to remain even more up-to-date with the developing activities,
please clone the version controlled source as described in @ref{Version
controlled source}.
+@item SAO DS9
+SAO DS9 (@command{ds9}) is a visualization tool for FITS images.
+Gnuastro's @command{astscript-fits-view} program calls DS9 to visualize FITS
images.
+We have a full appendix on it and how to install it in @ref{SAO DS9}.
+Since it is a run-time dependency, it can be installed at any later time
(after building and installing Gnuastro).
+@item TOPCAT
+TOPCAT (@command{topcat}) is a visualization tool for astronomical tables
(most commonly: plotting).
+Gnuastro's @command{astscript-fits-view} program calls TOPCAT to visualize tables.
+We have a full appendix on it and how to install it in @ref{TOPCAT}.
+Since it is a run-time dependency, it can be installed at any later time
(after building and installing Gnuastro).
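+
+For example, once both viewers are installed, the same script selects the proper one based on the input (the file names here are hypothetical):
+
+@example
+$ astscript-fits-view image.fits   # Opens in SAO DS9.
+$ astscript-fits-view table.fits   # Opens in TOPCAT.
+@end example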
@end table
-@cindex Gzip
-@cindex Lzip
-Gnuastro's official/stable tarball is released with two formats: Gzip (with
suffix @file{.tar.gz}) and Lzip (with suffix @file{.tar.lz}).
-The pre-release tarballs (after version 0.3) are released only as an Lzip
tarball.
-Gzip is a very well-known and widely used compression program created by GNU
and available in most systems.
-However, Lzip provides a better compression ratio and more robust archival
capacity.
-For example, Gnuastro 0.3's tarball was 2.9MB and 4.3MB with Lzip and Gzip
respectively, see the @url{http://www.nongnu.org/lzip/lzip.html, Lzip web page}
for more.
-Lzip might not be pre-installed in your operating system, if so, installing it
from your operating system's package manager or from source is very easy and
fast (it is a very small program).
-The GNU FTP server is mirrored (has backups) in various locations on the globe
(@url{http://www.gnu.org/order/ftp.html}).
-You can use the closest mirror to your location for a faster download.
-Note that only some mirrors keep track of the pre-release (alpha) tarballs.
-Also note that if you want to download immediately after an announcement (see @ref{Announcements}), the mirrors might need some time to synchronize with the main GNU FTP server.
-@node Version controlled source, , Release tarball, Downloading the source
-@subsection Version controlled source
+@node Bootstrapping dependencies, Dependencies from package managers, Optional
dependencies, Dependencies
+@subsection Bootstrapping dependencies
-@cindex Git
-@cindex Version control
-The publicly distributed Gnuastro tarball (for example,
@file{gnuastro-X.X.tar.gz}) does not contain the revision history, it is only a
snapshot of the source code at one significant instant of Gnuastro's history
(specified by the version number, see @ref{Version numbering}), ready to be
configured and built.
-To be able to develop successfully, the revision history of the code can be
very useful to track when something was added or changed, also some updates
that are not yet officially released might be in it.
+Bootstrapping is only necessary if you have decided to obtain the full version
controlled history of Gnuastro, see @ref{Version controlled source} and
@ref{Bootstrapping}.
+Using the version controlled source enables you to always be up to date with
the most recent development work of Gnuastro (bug fixes, new functionalities,
improved algorithms, etc.).
+If you have downloaded a tarball (see @ref{Downloading the source}), then you
can ignore this subsection.
-We use Git for the version control of Gnuastro.
-For those who are not familiar with it, we recommend the
@url{https://git-scm.com/book/en, ProGit book}.
-The whole book is publicly available for online reading and downloading and
does a wonderful job at explaining the concepts and best practices.
+To successfully run the bootstrapping process, there are some additional
dependencies to those discussed in the previous subsections.
+These are low-level tools that are used by a large collection of Unix-like operating system programs; therefore, they are most probably already available on your system.
+If they are not already installed, you should be able to easily find them in
any GNU/Linux distribution package management system (@command{apt-get},
@command{yum}, @command{pacman}, etc.).
+The short names in parentheses in @command{typewriter} font after the package name can be used to search for them in your package manager.
+For the GNU Portability Library, GNU Autoconf Archive and @TeX{} Live, it is
recommended to use the instructions here, not your operating system's package
manager.
-Let's assume you want to keep Gnuastro in the @file{TOPGNUASTRO} directory
(can be any directory, change the value below).
-The full version controlled history of Gnuastro can be cloned in
@file{TOPGNUASTRO/gnuastro} by running the following commands@footnote{If your
internet connection is active, but Git complains about the network, it might be
due to your network setup not recognizing the Git protocol.
-In that case use the following URL which uses the HTTP protocol instead:
@command{http://git.sv.gnu.org/r/gnuastro.git}}:
+@table @asis
+
+@item GNU Portability Library (Gnulib)
+@cindex GNU C library
+@cindex Gnulib: GNU Portability Library
+@cindex GNU Portability Library (Gnulib)
+To ensure portability for a wider range of operating systems (those that do
not include GNU C library, namely glibc), Gnuastro depends on the GNU
portability library, or Gnulib.
+Gnulib keeps a copy of all the functions in glibc, implemented (as much as
possible) to be portable to other operating systems.
+The @file{bootstrap} script can automatically clone Gnulib (as a @file{gnulib/} directory inside Gnuastro); however, as described in @ref{Bootstrapping}, this is not recommended.
+
+The recommended way to bootstrap Gnuastro is to first clone Gnulib and the
Autoconf archives (see below) into a local directory outside of Gnuastro.
+Let's call it @file{DEVDIR}@footnote{If you are not a developer in Gnulib or
Autoconf archives, @file{DEVDIR} can be a directory that you do not backup.
+In this way the large number of files in these projects will not slow down
your backup process or take bandwidth (if you backup to a remote server).}
(which you can set to any directory, preferably where you keep your other development projects).
+Currently in Gnuastro, both Gnulib and Autoconf archives have to be cloned in
the same top directory@footnote{If you already have the Autoconf archives in a
separate directory, or cannot clone it in the same directory as Gnulib, or you
have it with another directory name (not @file{autoconf-archive/}), you can
follow this short step.
+Set @file{AUTOCONFARCHIVES} to your desired address.
+Then define a symbolic link in @file{DEVDIR} with the following command so
Gnuastro's bootstrap script can find it:@*@command{$ ln -s $AUTOCONFARCHIVES
$DEVDIR/autoconf-archive}.} like the case here@footnote{If your internet
connection is active, but Git complains about the network, it might be due to
your network setup not recognizing the git protocol.
+In that case use the following URL for the HTTP protocol instead (for Autoconf
archives, replace the name): @command{http://git.sv.gnu.org/r/gnulib.git}}:
@example
-$ TOPGNUASTRO=/home/yourname/Research/projects/
-$ cd $TOPGNUASTRO
-$ git clone git://git.sv.gnu.org/gnuastro.git
+$ DEVDIR=/home/yourname/Development ## Select any location.
+$ mkdir $DEVDIR ## If it doesn't exist!
+$ cd $DEVDIR
+$ git clone https://git.sv.gnu.org/git/gnulib.git
+$ git clone https://git.sv.gnu.org/git/autoconf-archive.git
@end example
+Gnulib is a source-based dependency of Gnuastro's bootstrapping process, so simply having it on your computer is enough; there is no need to install it, and thus nothing to check.
+
@noindent
-The @file{$TOPGNUASTRO/gnuastro} directory will contain hand-written (version
controlled) source code for Gnuastro's programs, libraries, this book and the
tests.
-All are divided into sub-directories with standard and very descriptive names.
-The version controlled files in the top cloned directory are either mainly in
capital letters (for example, @file{THANKS} and @file{README}) or mainly
written in small-caps (for example, @file{configure.ac} and @file{Makefile.am}).
-The former are non-programming, standard writing for human readers containing
high-level information about the whole package.
-The latter are instructions to customize the GNU build system for Gnuastro.
-For more on Gnuastro's source code structure, please see @ref{Developing}.
-We will not go any deeper here.
+You now have the full version controlled source of these two repositories in
separate directories.
+Both these packages are regularly updated, so every once in a while, you can
run @command{$ git pull} within them to get any possible updates.
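+For example (using the @file{DEVDIR} defined above):
+
+@example
+$ cd $DEVDIR/gnulib && git pull
+$ cd $DEVDIR/autoconf-archive && git pull
+@end example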
-The cloned Gnuastro source cannot immediately be configured, compiled, or
installed since it only contains hand-written files, not automatically
generated or imported files which do all the hard work of the build process.
-See @ref{Bootstrapping} for the process of generating and importing those
files (it is not too hard!).
-Once you have bootstrapped Gnuastro, you can run the standard procedures (in
@ref{Quick start}).
-Very soon after you have cloned it, Gnuastro's main @file{master} branch will
be updated on the main repository (since the developers are actively working on
Gnuastro), for the best practices in keeping your local history in sync with
the main repository see @ref{Synchronizing}.
+@item GNU Automake (@command{automake})
+@cindex GNU Automake
+GNU Automake will build the @file{Makefile.in} files in each sub-directory
using the (hand-written) @file{Makefile.am} files.
+The @file{Makefile.in}s are subsequently used to generate the @file{Makefile}s
when the user runs @command{./configure} before building.
+To check that you have a working GNU Automake in your system, you can try this
command:
+@example
+$ automake --version
+@end example
+@item GNU Autoconf (@command{autoconf})
+@cindex GNU Autoconf
+GNU Autoconf will build the @file{configure} script using the configurations
we have defined (hand-written) in @file{configure.ac}.
+To check that you have a working GNU Autoconf in your system, you can try this
command:
-@menu
-* Bootstrapping:: Adding all the automatically generated files.
-* Synchronizing:: Keep your local clone up to date.
-@end menu
-
-@node Bootstrapping, Synchronizing, Version controlled source, Version
controlled source
-@subsubsection Bootstrapping
+@example
+$ autoconf --version
+@end example
-@cindex Bootstrapping
+@item GNU Autoconf Archive
@cindex GNU Autoconf Archive
-@cindex Gnulib: GNU Portability Library
-@cindex GNU Portability Library (Gnulib)
-@cindex Automatically created build files
-@noindent
-The version controlled source code lacks the source files that we have not
written or are automatically built.
-These automatically generated files are included in the distributed tarball
for each distribution (for example, @file{gnuastro-X.X.tar.gz}, see
@ref{Version numbering}) and make it easy to immediately configure, build, and
install Gnuastro.
-However from the perspective of version control, they are just bloatware and
sources of confusion (since they are not changed by Gnuastro developers).
+These are a large collection of tests that can be run at @command{./configure} time.
+See the explanation under GNU Portability Library (Gnulib) above for
instructions on obtaining it and keeping it up to date.
-The process of automatically building and importing necessary files into the
cloned directory is known as @emph{bootstrapping}.
-After bootstrapping is done you are ready to follow the default GNU build
steps that you normally run on the tarball (@command{./configure && make} for
example, described more in @ref{Quick start}).
-Some known issues with bootstrapping may occur during the process, to see how
to fix them, please see @ref{Known issues}.
+GNU Autoconf Archive is a source-based dependency of Gnuastro's bootstrapping process, so simply having it on your computer is enough; there is no need to install it, and thus nothing to check.
+Just do not forget that it has to be in the same directory as Gnulib
(described above).
-All the instructions for an automatic bootstrapping are available in
@file{bootstrap} and configured using @file{bootstrap.conf}.
-@file{bootstrap} and @file{COPYING} (which contains the software copyright
notice) are the only files not written by Gnuastro developers but under version
control to enable simple bootstrapping and legal information on usage
immediately after cloning.
-@file{bootstrap} is maintained by the GNU Portability Library (Gnulib) and this file is an identical copy, so do not make any changes in it since it will be replaced when Gnulib releases an update.
-Make all your changes in @file{bootstrap.conf}.
+@item GNU Texinfo (@command{texinfo})
+@cindex GNU Texinfo
+GNU Texinfo is the tool that formats this manual into the various output
formats.
+To bootstrap Gnuastro you need all of Texinfo's command-line programs.
+However, some operating systems package them separately; for example, in Fedora, @command{makeinfo} is packaged in the @command{texinfo-tex} package.
-The bootstrapping process has its own separate set of dependencies, the full
list is given in @ref{Bootstrapping dependencies}.
-They are generally very low-level and used by a very large set of commonly
used programs, so they are probably already installed on your system.
-The simplest way to bootstrap Gnuastro is to simply run the bootstrap script
within your cloned Gnuastro directory as shown below.
-However, please read the next paragraph before doing so (see @ref{Version
controlled source} for @file{TOPGNUASTRO}).
+To check that you have a working GNU Texinfo in your system, you can try this
command:
@example
-$ cd TOPGNUASTRO/gnuastro
-$ ./bootstrap # Requires internet connection
+$ makeinfo --version
@end example
-Without any options, @file{bootstrap} will clone Gnulib within your cloned
Gnuastro directory (@file{TOPGNUASTRO/gnuastro/gnulib}) and download the
necessary Autoconf archives macros.
-So if you run bootstrap like this, you will need an internet connection every
time you decide to bootstrap.
-Also, Gnulib is a large package and cloning it can be slow.
-It will also keep the full Gnulib repository within your Gnuastro repository,
so if another one of your projects also needs Gnulib, and you insist on running
bootstrap like this, you will have two copies.
-In case you regularly backup your important files, Gnulib will also slow down
the backup process.
-Therefore while the simple invocation above can be used with no problem, it is
not recommended.
-To do better, see the next paragraph.
+@item GNU Libtool (@command{libtool})
+@cindex GNU Libtool
+GNU Libtool is in charge of building all the libraries in Gnuastro.
+The libraries contain functions that are used by more than one program and are
installed for use in other programs.
+They are thus put in a separate directory (@file{lib/}).
-The recommended way to get these two packages is thoroughly discussed in
@ref{Bootstrapping dependencies} (in short: clone them in the separate
@file{DEVDIR/} directory).
-The following commands will take you into the cloned Gnuastro directory and
run the @file{bootstrap} script, while telling it to copy some files (instead
of making symbolic links, with the @option{--copy} option, this is not
mandatory@footnote{The @option{--copy} option is recommended because some
backup systems might do strange things with symbolic links.}) and where to look
for Gnulib (with the @option{--gnulib-srcdir} option).
-Please note that the address given to @option{--gnulib-srcdir} has to be an
absolute address (so do not use @file{~} or @file{../} for example).
+To check that you have a working GNU Libtool in your system, you can try this command (from its output, make sure it is GNU's Libtool):
@example
-$ cd $TOPGNUASTRO/gnuastro
-$ ./bootstrap --copy --gnulib-srcdir=$DEVDIR/gnulib
+$ libtool --version
@end example
-@cindex GNU Texinfo
-@cindex GNU Libtool
-@cindex GNU Autoconf
-@cindex GNU Automake
-@cindex GNU C library
-@cindex GNU build system
-Since Gnulib and Autoconf archives are now available in your local
directories, you do not need an internet connection every time you decide to
remove all un-tracked files and redo the bootstrap (see box below).
-You can also use the same command on any other project that uses Gnulib.
-All the necessary GNU C library functions, Autoconf macros and Automake inputs
are now available along with the book figures.
-The standard GNU build system (@ref{Quick start}) will do the rest of the job.
+@item GNU help2man (@command{help2man})
+@cindex GNU help2man
+GNU help2man is used to convert the output of the @option{--help} option
+(@ref{--help}) to the traditional Man page (@ref{Man pages}).
-@cartouche
-@noindent
-@strong{Undoing the bootstrap:}
-During the development, it might happen that you want to remove all the
automatically generated and imported files.
-In other words, you might want to reverse the bootstrap process.
-Fortunately Git has a good program for this job: @command{git clean}.
-Run the following command and every file that is not version controlled will
be removed.
+To check that you have a working GNU Help2man in your system, you can try this
command:
@example
-git clean -fxd
+$ help2man --version
@end example
-@noindent
-It is best to commit any recent change before running this command.
-You might have created new files since the last commit and if they have not
been committed, they will all be gone forever (using @command{rm}).
-To get a list of the non-version controlled files instead of deleting them,
add the @option{n} option to @command{git clean}, so it becomes @option{-fxdn}.
-@end cartouche
-
-Besides the @file{bootstrap} and @file{bootstrap.conf}, the
@file{bootstrapped/} directory and @file{README-hacking} file are also related
to the bootstrapping process.
-The former hosts all the imported (bootstrapped) directories.
-Thus, in the version controlled source, it only contains a @file{README} file,
but in the distributed tarball it also contains sub-directories filled with all
bootstrapped files.
-@file{README-hacking} contains a summary of the bootstrapping process
discussed in this section.
-It is a necessary reference when you have not built this book yet.
-It is thus not distributed in the Gnuastro tarball.
-
-@node Synchronizing, , Bootstrapping, Version controlled source
-@subsubsection Synchronizing
+@item @LaTeX{} and some @TeX{} packages
+@cindex @LaTeX{}
+@cindex @TeX{} Live
+Some of the figures in this book are built by @LaTeX{} (using the PGF/TikZ
package).
+For easy maintenance, the @LaTeX{} source of those figures is version controlled, not the actual figures.
+The @file{./bootstrap} script will therefore run @LaTeX{} to build the figures.
+The best way to install @LaTeX{} and all the necessary packages is through
@url{https://www.tug.org/texlive/, @TeX{} live} which is a package manager for
@TeX{} related tools that is independent of any operating system.
+It is thus preferred to the @TeX{} Live versions distributed by your operating
system.
-The bootstrapping script (see @ref{Bootstrapping}) is not regularly needed:
you mainly need it after you have cloned Gnuastro (once) and whenever you want
to re-import the files from Gnulib, or Autoconf
archives@footnote{@url{https://savannah.gnu.org/task/index.php?13993} is
defined for you to check if significant (for Gnuastro) updates are made in
these repositories, since the last time you pulled from them.} (not too common).
-However, Gnuastro developers are constantly working on Gnuastro and are
pushing their changes to the official repository.
-Therefore, your local Gnuastro clone will soon be out-dated.
-Gnuastro has two mailing lists dedicated to its developing activities (see
@ref{Developing mailing lists}).
-Subscribing to them can help you decide when to synchronize with the official
repository.
+To install @TeX{} Live, go to the web page and download the appropriate
installer by following the ``download'' link.
+Note that by default the full package repository will be downloaded and installed (around 4 gigabytes), which can take @emph{very} long to download and to update later.
+However, most packages are not needed by everyone; it is easier, faster and better to install only the ``Basic scheme'' (consisting of only the most basic @TeX{} and @LaTeX{} packages, which is less than 200 megabytes)@footnote{You can also download the DVD ISO file at a later time to keep as a backup for when you do not have an internet connection and need a package.}.
-To pull all the most recent work in Gnuastro, run the following command from
the top Gnuastro directory.
-If you do not already have a built system, ignore @command{make distclean}.
-The separate steps are described in detail afterwards.
+After the installation, be sure to set the environment variables as suggested at the end of the installer's output.
+Any time you need a package you do not have, simply install it with a command like the one below (similar to how you install software from your operating system's package manager)@footnote{After running @TeX{}, or @LaTeX{}, you might get a warning complaining about a @file{missingfile}.
+Run `@command{tlmgr info missingfile}' to see the package(s) containing that file, which you can then install.}.
+To install all the necessary @TeX{} packages for a successful Gnuastro
bootstrap, run this command:
@example
-$ make distclean && git pull && autoreconf -f
+$ sudo su
+# tlmgr install epsf jknapltx caption biblatex biber iftex \
+ etoolbox logreq xstring xkeyval pgf ms \
+ xcolor pgfplots times rsfs ps2eps epspdf
@end example
-@noindent
-You can also run the commands separately:
+To check that you have a working @LaTeX{} executable in your system, you can try this command (this just checks that @LaTeX{} exists; as described above, if a package is missing, you can easily identify it from the output and install it with @command{tlmgr}):
@example
-$ make distclean
-$ git pull
-$ autoreconf -f
+$ latex --version
@end example
-@cindex GNU Autoconf
-@cindex Mailing list: info-gnuastro
-@cindex @code{info-gnuastro@@gnu.org}
-If Gnuastro was already built in this directory, you do not want some outputs
from the previous version being mixed with outputs from the newly pulled work.
-Therefore, the first step is to clean/delete all the built files with
@command{make distclean}.
-Fortunately the GNU build system allows the separation of source and built
files (in separate directories).
-This is a great feature to keep your source directory clean and you can use it
to avoid the cleaning step.
-Gnuastro comes with a script with some useful options for this job.
-It is useful if you regularly pull recent changes, see @ref{Separate build and
source directories}.
-After the pull, we must re-configure Gnuastro with @command{autoreconf -f}
(part of GNU Autoconf).
-It will update the @file{./configure} script and all the
@file{Makefile.in}@footnote{In the GNU build system, @command{./configure} will
use the @file{Makefile.in} files to create the necessary @file{Makefile} files
that are later read by @command{make} to build the package.} files based on the
hand-written configurations (in @file{configure.ac} and the @file{Makefile.am}
files).
-After running @command{autoreconf -f}, a warning about @code{TEXI2DVI} might
show up, you can ignore that.
+@item ImageMagick (@command{imagemagick})
+@cindex ImageMagick
+ImageMagick is a wonderful and robust program for image manipulation on the
command-line.
+@file{bootstrap} uses it to convert the book images into the formats necessary
for the various book formats.
-The most important reason for re-building Gnuastro's build system is to
generate/update the version number for your updated Gnuastro snapshot.
-This generated version number will include the commit information (see
@ref{Version numbering}).
-The version number is included in nearly all outputs of Gnuastro's programs,
therefore it is vital for reproducing an old result.
+Since ImageMagick version 7, it is necessary to edit the policy file (@file{/etc/ImageMagick-7/policy.xml}) to have the following line (it may already be present but commented; in that case, un-comment it):
-As a summary, be sure to run `@command{autoreconf -f}' after every change in
the Git history.
-This includes synchronization with the main server or even a commit you have
made yourself.
+@example
+<policy domain="coder" rights="read|write" pattern="@{PS,PDF,XPS@}"/>
+@end example
-If you would like to see what has changed since you last synchronized your
local clone, you can take the following steps instead of the simple command
above (do not type anything after @code{#}):
+If the following line is present, it is also necessary to comment/remove it.
@example
-$ git checkout master # Confirm if you are on master.
-$ git fetch origin # Fetch all new commits from server.
-$ git log master..origin/master # See all the new commit messages.
-$ git merge origin/master # Update your master branch.
-$ autoreconf -f # Update the build system.
+<policy domain="delegate" rights="none" pattern="gs" />
@end example
-@noindent
-By default @command{git log} prints the most recent commit first, add the
@option{--reverse} option to see the changes chronologically.
-To see exactly what has been changed in the source code along with the commit
message, add a @option{-p} option to the @command{git log}.
+To learn more about the ImageMagick security policy, please see @url{https://imagemagick.org/script/security-policy.php}.
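+
+To confirm that your edits took effect, you can print the policies that your ImageMagick build actually enforces with this command:
+
+@example
+$ convert -list policy
+@end example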
-If you want to make changes in the code, have a look at @ref{Developing} to
get started easily.
-Be sure to commit your changes in a separate branch (keep your @code{master}
branch to follow the official repository) and re-run @command{autoreconf -f}
after the commit.
-If you intend to send your work to us, you can safely use your commit since it
will be ultimately recorded in Gnuastro's official history.
-If not, please upload your separate branch to a public hosting service, for
example, @url{https://codeberg.org, Codeberg}, and link to it in your
report/paper.
-Alternatively, run @command{make distcheck} and upload the output
@file{gnuastro-X.X.X.XXXX.tar.gz} to a publicly accessible web page so your
results can be considered scientific (reproducible) later.
+To check that you have a working ImageMagick in your system, you can try this
command:
+@example
+$ convert --version
+@end example
+@end table
+@node Dependencies from package managers, , Bootstrapping dependencies,
Dependencies
+@subsection Dependencies from package managers
+@cindex Package managers
+@cindex Source code building
+@cindex Building from source
+@cindex Compiling from source
+@cindex Source code compilation
+@cindex Distributions, GNU/Linux
+The most basic way to install a package on your system is to build it from source yourself.
+Alternatively, you can use your operating system's package manager to download
pre-compiled files and install them.
+The latter choice is easier and faster.
+However, we recommend that you build the @ref{Mandatory dependencies} yourself
from source (all necessary commands and links are given in the respective
section).
+Here are some basic reasons behind this recommendation.
+@enumerate
+@item
+Your operating system's pre-built software might not be the most recent
release.
+For example, Gnuastro itself is also packaged in some package managers.
+For the list see: @url{https://repology.org/project/gnuastro/versions}.
+You will notice that Gnuastro's version in some operating systems is more than
10 versions old!
+It is the same for all the dependencies of Gnuastro.
+@item
+For each package, Gnuastro might perform better with (or require) certain configuration options that your distribution's package managers did not add for you.
+If present, these configuration options are explained during the installation
of each in the sections below (for example, in @ref{CFITSIO}).
+When the proper configuration has not been set, the programs should complain
and inform you.
+@item
+For the libraries, package managers might separate the binary file from the header files, which can cause confusion; see @ref{Known issues}.
+@item
+Like any other tool, the science you derive from Gnuastro's tools highly depends on these lower-level dependencies, so generally it is much better to have a close connection with them.
+By reading their manuals, installing them and staying up to date with
changes/bugs in them, your scientific results and understanding (of what is
going on, and thus how you interpret your scientific results) will also
correspondingly improve.
+@end enumerate
-@node Build and install, , Downloading the source, Installation
-@section Build and install
+Based on your package manager, you can use any of the following commands to
install the mandatory and optional dependencies.
+If your package manager is not included in the list below, please send us the respective command, so we can add it.
+For better archivability and compression ratios, Gnuastro's recommended
tarball compression format is with the @url{http://lzip.nongnu.org/lzip.html,
Lzip} program, see @ref{Release tarball}.
+Therefore, the package manager commands below also contain Lzip.
-This section is basically a longer explanation to the sequence of commands
given in @ref{Quick start}.
-If you did not have any problems during the @ref{Quick start} steps, you want
to have all the programs of Gnuastro installed in your system, you do not want
to change the executable names during or after installation, you have root
access to install the programs in the default system wide directory, the Letter
paper size of the print book is fine for you or as a summary you do not feel
like going into the details when everything is working, you can safely skip
this section.
+@table @asis
+@item @command{apt-get} (Debian-based OSs: Debian, Ubuntu, Linux Mint, etc.)
+@cindex Debian
+@cindex Ubuntu
+@cindex Linux Mint
+@cindex @command{apt-get}
+@cindex Advanced Packaging Tool (APT, Debian)
+@url{https://en.wikipedia.org/wiki/Debian,Debian} is one of the oldest
+GNU/Linux
+distributions@footnote{@url{https://en.wikipedia.org/wiki/List_of_Linux_distributions#Debian-based}}.
+It thus has a very extended user community and a robust internal structure and
standards.
+All of it is free software and based on the work of volunteers around the
world.
+Many distributions are thus derived from it, for example, Ubuntu and Linux
Mint.
+This arguably makes Debian-based OSs the largest, and most used, class of
GNU/Linux distributions.
+All of them use Debian's Advanced Packaging Tool (APT, for example,
@command{apt-get}) for managing packages.
-If you have any of the above problems or you want to understand the details
for a better control over your build and install, read along.
-The dependencies which you will need prior to configuring, building and
installing Gnuastro are explained in @ref{Dependencies}.
-The first three steps in @ref{Quick start} need no extra explanation, so we
will skip them and start with an explanation of Gnuastro specific configuration
options and a discussion on the installation directory in @ref{Configuring},
followed by some smaller subsections: @ref{Tests}, @ref{A4 print book}, and
@ref{Known issues} which explains the solutions to known problems you might
encounter in the installation steps and ways you can solve them.
+@table @asis
+@item Mandatory dependencies
+Without these, Gnuastro cannot be built; they are necessary for input/output and low-level mathematics (see @ref{Mandatory dependencies})!
+@example
+$ sudo apt-get install libgsl-dev libcfitsio-dev \
+ wcslib-dev
+@end example
+@item Optional dependencies
+If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
+@example
+$ sudo apt-get install ghostscript libtool-bin \
+ libjpeg-dev libtiff-dev \
+ libgit2-dev curl lzip
+@end example
-@menu
-* Configuring:: Configure Gnuastro
-* Separate build and source directories:: Keeping derivate/build files
separate.
-* Tests:: Run tests to see if it is working.
-* A4 print book:: Customize the print book.
-* Known issues:: Issues you might encounter.
-@end menu
+@item Programs to view FITS images or tables
+These are not used in Gnuastro's build.
+They can just help in viewing the inputs/outputs independent of Gnuastro!
+@example
+$ sudo apt-get install saods9 topcat
+@end example
+@end table
+@noindent
+Gnuastro is @url{https://tracker.debian.org/pkg/gnuastro,packaged} in Debian
(and thus some of its derivative operating systems).
+Just make sure it is the most recent version.
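+In that case, installing Gnuastro itself is a single command (assuming the
packaged version is recent enough for your needs):
+@example
+$ sudo apt-get install gnuastro
+@end example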
+@item @command{dnf}
+@itemx @command{yum} (Red Hat-based OSs: Red Hat, Fedora, CentOS, Scientific
Linux, etc.)
+@cindex RHEL
+@cindex Fedora
+@cindex CentOS
+@cindex Red Hat
+@cindex @command{dnf}
+@cindex @command{yum}
+@cindex Scientific Linux
+@url{https://en.wikipedia.org/wiki/Red_Hat,Red Hat Enterprise Linux} (RHEL) is
released by Red Hat Inc.
+RHEL requires paid subscriptions for use of its binaries and support.
+But since it is free software, many other teams use its code to spin off
their own distributions based on RHEL.
+Red Hat-based GNU/Linux distributions initially used the ``Yellowdog Updater,
Modified'' (YUM) package manager, which has been replaced by ``Dandified yum''
(DNF).
+If the latter is not available on your system, you can use @command{yum}
instead of @command{dnf} in the commands below.
+@table @asis
+@item Mandatory dependencies
+Without these, Gnuastro cannot be built; they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
+@example
+$ sudo dnf install gsl-devel cfitsio-devel \
+ wcslib-devel
+@end example
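+With @command{yum}, the equivalent of the command above would be:
+@example
+$ sudo yum install gsl-devel cfitsio-devel \
+                   wcslib-devel
+@end example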
+@item Optional dependencies
+If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
+@example
+$ sudo dnf install ghostscript libtool \
+ libjpeg-devel libtiff-devel \
+ libgit2-devel lzip curl
+@end example
-@node Configuring, Separate build and source directories, Build and install,
Build and install
-@subsection Configuring
+@item Programs to view FITS images or tables
+These are not used in Gnuastro's build.
+They simply help in viewing the inputs/outputs, independently of Gnuastro!
+@example
+$ sudo dnf install saods9 topcat
+@end example
+@end table
-@pindex ./configure
-@cindex Configuring
-The @command{$ ./configure} step is the most important step in the build and
install process.
-All the required packages, libraries, headers and environment variables are
checked in this step.
-The behaviors of make and make install can also be set through command-line
options to this command.
+@item @command{brew} (macOS)
+@cindex macOS
+@cindex Homebrew
+@cindex MacPorts
+@cindex @command{brew}
+@url{https://en.wikipedia.org/wiki/MacOS,macOS} is the operating system used
on Apple devices.
+macOS does not come with a package manager pre-installed, but several widely
used, third-party package managers exist, such as Homebrew or MacPorts.
+Both are free software.
+Currently we have only tested Gnuastro's installation with Homebrew as
described below.
+If not already installed, first obtain Homebrew by following the instructions
at @url{https://brew.sh}.
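+At the time of writing, those instructions amount to running a command like
the one below; please confirm on the Homebrew page first, since it may change:
+@example
+$ /bin/bash -c "$(curl -fsSL \
+   https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
+@end example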
-@cindex Configure options
-@cindex Customizing installation
-@cindex Installation, customizing
-The configure script accepts various arguments and options which enable the
final user to highly customize whatever she is building.
-The options to configure are generally very similar to normal program options
explained in @ref{Arguments and options}.
-Similar to all GNU programs, you can get a full list of the options along with
a short explanation by running
+@table @asis
+@item Mandatory dependencies
+Without these, Gnuastro cannot be built; they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
+Homebrew manages packages in different `taps'.
+To install WCSLIB via Homebrew you will need to @command{tap} into
@command{brewsci/science} first (the tap may change in the future, but can be
found by calling @command{brew search wcslib}).
@example
-$ ./configure --help
+$ brew tap brewsci/science
+$ brew install wcslib gsl cfitsio
@end example
-@noindent
-@cindex GNU Autoconf
-A complete explanation is also included in the @file{INSTALL} file.
-Note that this file was written by the authors of GNU Autoconf (which builds
the @file{configure} script), therefore it is common for all programs which use
the @command{$ ./configure} script for building and installing, not just
Gnuastro.
-Here we only discuss cases where you do not have superuser access to the
system and if you want to change the executable names.
-But before that, a review of the options to configure that are particular to
Gnuastro are discussed.
+@item Optional dependencies
+If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
+@example
+$ brew install ghostscript libtool libjpeg \
+ libtiff libgit2 curl lzip
+@end example
-@menu
-* Gnuastro configure options:: Configure options particular to Gnuastro.
-* Installation directory:: Specify the directory to install.
-* Executable names:: Changing executable names.
-* Configure and build in RAM:: For minimal use of HDD or SSD, and clean
source.
-@end menu
+@item Programs to view FITS images or tables
+These are not used in Gnuastro's build.
+They simply help in viewing the inputs/outputs, independently of Gnuastro!
+@example
+$ brew install saoimageds9 topcat
+@end example
+@end table
-@node Gnuastro configure options, Installation directory, Configuring,
Configuring
-@subsubsection Gnuastro configure options
+@item @command{pacman} (Arch Linux)
+@cindex Arch GNU/Linux
+@cindex @command{pacman}
+@url{https://en.wikipedia.org/wiki/Arch_Linux,Arch Linux} is a smaller
GNU/Linux distribution, which follows the KISS principle (``keep it simple,
stupid'') as a general guideline.
+It ``focuses on elegance, code correctness, minimalism and simplicity, and
expects the user to be willing to make some effort to understand the system's
operation''.
+Arch GNU/Linux uses ``Package manager'' (Pacman) to manage its
packages/components.
-@cindex @command{./configure} options
-@cindex Configure options particular to Gnuastro
-Most of the options to configure (which are to do with building) are similar
for every program which uses this script.
-Here the options that are particular to Gnuastro are discussed.
-The next topics explain the usage of other configure options which can be
applied to any program using the GNU build system (through the configure
script).
+@table @asis
+@item Mandatory dependencies
+Without these, Gnuastro cannot be built; they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
+@example
+$ sudo pacman -S gsl cfitsio wcslib
+@end example
-@vtable @option
+@item Optional dependencies
+If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
+@example
+$ sudo pacman -S ghostscript libtool libjpeg \
+ libtiff libgit2 curl lzip
+@end example
-@item --enable-debug
-@cindex Valgrind
-@cindex Debugging
-@cindex GNU Debugger
-Compile/build Gnuastro with debugging information, no optimization and without
shared libraries.
+@item Programs to view FITS images or tables
+These are not used in Gnuastro's build.
+They simply help in viewing the inputs/outputs, independently of Gnuastro!
-In order to allow more efficient programs when using Gnuastro (after the
installation), by default Gnuastro is built with a 3rd level (a very high
level) optimization and no debugging information.
-By default, libraries are also built for static @emph{and} shared linking (see
@ref{Linking}).
-However, when there are crashes or unexpected behavior, these three features
can hinder the process of localizing the problem.
-This configuration option is identical to manually calling the configuration
script with @code{CFLAGS="-g -O0" --disable-shared}.
+SAO DS9 and TOPCAT are not available in the standard Arch GNU/Linux
repositories.
+However, installing and using both is very easy from their own web pages, as
described in @ref{SAO DS9} and @ref{TOPCAT}.
+@end table
+
+@item @command{zypper} (openSUSE and SUSE Linux Enterprise Server)
+@cindex openSUSE
+@cindex SUSE Linux Enterprise Server
+@cindex @command{zypper}, OpenSUSE package manager
+SUSE Linux Enterprise
Server@footnote{@url{https://www.suse.com/products/server}} (SLES) is SUSE's
commercial offering, which shares its code and tools with the
community-developed openSUSE distribution.
+Many additional packages are offered in the Build
Service@footnote{@url{https://build.opensuse.org}}.
+openSUSE and SLES use @command{zypper} (CLI) and YaST (GUI) for managing
repositories and packages.
+
+@table @asis
+@item Configuration
+When building Gnuastro, run the configure script with the following
@code{CPPFLAGS} environment variable:
-In the (rare) situations where you need to do your debugging on the shared
libraries, do not use this option.
-Instead run the configure script by explicitly setting @code{CFLAGS} like this:
@example
-$ ./configure CFLAGS="-g -O0"
+$ ./configure CPPFLAGS="-I/usr/include/cfitsio"
@end example
-@item --enable-check-with-valgrind
-@cindex Valgrind
-Do the @command{make check} tests through Valgrind.
-Therefore, if any crashes or memory-related issues (segmentation faults in
particular) occur in the tests, the output of Valgrind will also be put in the
@file{tests/test-suite.log} file without having to manually modify the check
scripts.
-This option will also activate Gnuastro's debug mode (see the
@option{--enable-debug} configure-time option described above).
+@item Mandatory dependencies
+Without these, Gnuastro cannot be built; they are necessary for input/output
and low-level mathematics (see @ref{Mandatory dependencies})!
+@example
+$ sudo zypper install gsl-devel cfitsio-devel \
+ wcslib-devel
+@end example
-Valgrind is free software.
-It is a program for easy checking of memory-related issues in programs.
-It runs a program within its own controlled environment and can thus identify
the exact line-number in the program's source where a memory-related issue
occurs.
-However, it can significantly slow-down the tests.
-So this option is only useful when a segmentation fault is found during
@command{make check}.
+@item Optional dependencies
+If present, these libraries can be used in Gnuastro's build for extra
features, see @ref{Optional dependencies}.
+@example
+$ sudo zypper install ghostscript_any libtool \
+ pkgconfig libcurl-devel \
+ libgit2-devel \
+ libjpeg62-devel \
+ libtiff-devel curl
+@end example
-@item --enable-progname
-Only build and install @file{progname} along with any other program that is
enabled in this fashion.
-@file{progname} is the name of the executable without the @file{ast}, for
example, @file{crop} for Crop (with the executable name of @file{astcrop}).
+@item Programs to view FITS images or tables
+These are not used in Gnuastro's build.
+They simply help in viewing the inputs/outputs, independently of Gnuastro!
+@example
+$ sudo zypper install ds9 topcat
+@end example
+@end table
-Note that by default all the programs will be installed.
-This option (and the @option{--disable-progname} options) are only relevant
when you do not want to install all the programs.
-Therefore, if this option is called for any of the programs in Gnuastro, any
program which is not explicitly enabled will not be built or installed.
+@c Gnuastro is @url{https://software.opensuse.org/package/gnuastro,packaged}
+@c in @command{zypper}. Just make sure it is the most recent version.
+@end table
-@item --disable-progname
-@itemx --enable-progname=no
-Do not build or install the program named @file{progname}.
-This is very similar to the @option{--enable-progname}, but will build and
install all the other programs except this one.
+Usually, when libraries are installed by operating system package managers,
there should be no problems when configuring and building other programs from
source (that depend on the libraries: Gnuastro in this case).
+However, in some special conditions, problems may pop up during the
configuration, building, or checking/running of any of Gnuastro's programs.
+The most common such problems and their solutions are discussed below.
@cartouche
@noindent
-@strong{Note:} If some programs are enabled and some are disabled, it is
equivalent to simply enabling those that were enabled.
-Listing the disabled programs is redundant.
+@strong{Not finding library during configuration:} If a library is installed,
but during Gnuastro's @command{configure} step the library is not found, then
configure Gnuastro like the command below (correcting @file{/path/to/lib}).
+For more, see @ref{Known issues} and @ref{Installation directory}.
+@example
+$ ./configure LDFLAGS="-L/path/to/lib"
+@end example
@end cartouche
-@item --enable-gnulibcheck
-@cindex GNU C library
-@cindex Gnulib: GNU Portability Library
-@cindex GNU Portability Library (Gnulib)
-Enable checks on the GNU Portability Library (Gnulib).
-Gnulib is used by Gnuastro to enable users of non-GNU based operating systems
(that do not use GNU C library or glibc) to compile and use the advanced
features that this library provides.
-We make extensive use of such functions.
-If you give this option to @command{$ ./configure}, when you run @command{$
make check}, first the functions in Gnulib will be tested, then the Gnuastro
executables.
-If your operating system does not support glibc or has an older version of it
and you have problems in the build process (@command{$ make}), you can give
this flag to configure to see if the problem is caused by Gnulib not supporting
your operating system or Gnuastro, see @ref{Known issues}.
+@cartouche
+@noindent
+@strong{Not finding header (.h) files while building:} If a library is
installed, but during Gnuastro's @command{make} step, the library's header
(file with a @file{.h} suffix) is not found, then configure Gnuastro like the
command below (correcting @file{/path/to/include}).
+For more, see @ref{Known issues} and @ref{Installation directory}.
+@example
+$ ./configure CPPFLAGS="-I/path/to/include"
+@end example
+@end cartouche
-@item --disable-guide-message
-@itemx --enable-guide-message=no
-Do not print a guiding message during the GNU Build process of @ref{Quick
start}.
-By default, after each step, a message is printed guiding the user what the
next command should be.
-Therefore, after @command{./configure}, it will suggest running @command{make}.
-After @command{make}, it will suggest running @command{make check} and so on.
-If Gnuastro is configured with this option, for example
+@cartouche
+@noindent
+@strong{Gnuastro's programs do not run during check or after install:}
+If a library is installed, but the programs do not run due to linking
problems, set the @code{LD_LIBRARY_PATH} variable like below (assuming Gnuastro
is installed in @file{/path/to/installed}).
+For more, see @ref{Known issues} and @ref{Installation directory}.
@example
-$ ./configure --disable-guide-message
+$ export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/path/to/installed/lib"
@end example
-Then these messages will not be printed after any step (like most programs).
-For people who are not yet fully accustomed to this build system, these
guidelines can be very useful and encouraging.
-However, if you find those messages annoying, use this option.
+@end cartouche
-@item --without-libgit2
-@cindex Git
-@pindex libgit2
-@cindex Version control systems
-Build Gnuastro without libgit2 (for including Git commit hashes in output
files), see @ref{Optional dependencies}.
-libgit2 is an optional dependency, with this option, Gnuastro will ignore any
possibly existing libgit2 that may already be on the system.
-
-@item --without-libjpeg
-@pindex libjpeg
-@cindex JPEG format
-Build Gnuastro without libjpeg (for reading/writing to JPEG files), see
@ref{Optional dependencies}.
-libjpeg is an optional dependency, with this option, Gnuastro will ignore any
possibly existing libjpeg that may already be on the system.
-
-@item --without-libtiff
-@pindex libtiff
-@cindex TIFF format
-Build Gnuastro without libtiff (for reading/writing to TIFF files), see
@ref{Optional dependencies}.
-libtiff is an optional dependency, with this option, Gnuastro will ignore any
possibly existing libtiff that may already be on the system.
-@item --with-python
-@cindex PyPI
-@cindex Python
-Build the Python interface within Gnuastro's dynamic library.
-This interface can be used for easy communication with Python wrappers (for
example, the pyGnuastro package).
-When you install the pyGnuastro package from PyPI, the correct configuration
of the Gnuastro Library is already packaged with it (with the Python interface)
and that is independent of your Gnuastro installation.
-The Python interface is only necessary if you want to build pyGnuastro from
source (which is only necessary for developers).
-Therefore it has to be explicitly activated at configure time with this option.
-For more on the interface functions, see @ref{Python interface}.
-@end vtable
-The tests of some programs might depend on the outputs of the tests of other
programs.
-For example, MakeProfiles is one the first programs to be tested when you run
@command{$ make check}.
-MakeProfiles' test outputs (FITS images) are inputs to many other programs
(which in turn provide inputs for other programs).
-Therefore, if you do not install MakeProfiles for example, the tests for many
the other programs will be skipped.
-To avoid this, in one run, you can install all the programs and run the tests
but not install.
-If everything is working correctly, you can run configure again with only the
programs you want.
-However, do not run the tests and directly install after building.
-@node Installation directory, Executable names, Gnuastro configure options,
Configuring
-@subsubsection Installation directory
-@vindex --prefix
-@cindex Superuser, not possible
-@cindex Root access, not possible
-@cindex No access to superuser install
-@cindex Install with no superuser access
-One of the most commonly used options to @file{./configure} is
@option{--prefix}, it is used to define the directory that will host all the
installed files (or the ``prefix'' in their final absolute file name).
-For example, when you are using a server and you do not have administrator or
root access.
-In this example scenario, if you do not use the @option{--prefix} option, you
will not be able to install the built files and thus access them from anywhere
without having to worry about where they are installed.
-However, once you prepare your startup file to look into the proper place (as
discussed thoroughly below), you will be able to easily use this option and
benefit from any software you want to install without having to ask the system
administrators or install and use a different version of a software that is
already installed on the server.
+@node Downloading the source, Build and install, Dependencies, Installation
+@section Downloading the source
-The most basic way to run an executable is to explicitly write its full file
name (including all the directory information) and run it.
-One example is running the configuration script with the @command{$
./configure} command (see @ref{Quick start}).
-By giving a specific directory (the current directory or @file{./}), we are
explicitly telling the shell to look in the current directory for an executable
file named `@file{configure}'.
-Directly specifying the directory is thus useful for executables in the
current (or nearby) directories.
-However, when the program (an executable file) is to be used a lot, specifying
all those directories will become a significant burden.
-For example, the @file{ls} executable lists the contents in a given directory
and it is (usually) installed in the @file{/usr/bin/} directory by the
operating system maintainers.
-Therefore, if using the full address was the only way to access an executable,
each time you wanted a listing of a directory, you would have to run the
following command (which is very inconvenient, both in writing and in
remembering the various directories).
+Gnuastro's source code can be downloaded in two ways.
+As a tarball, ready to be configured and installed on your system (as
described in @ref{Quick start}), see @ref{Release tarball}.
+If you want official releases of stable versions, this is the best, easiest,
and most common option.
+Alternatively, you can clone the version controlled history of Gnuastro, run
one extra bootstrapping step and then follow the same steps as the tarball.
+This will give you access to all the most recent work that will be included in
the next release along with the full project history.
+The process is thoroughly introduced in @ref{Version controlled source}.
-@example
-$ /usr/bin/ls
-@end example
-@cindex Shell variables
-@cindex Environment variables
-To address this problem, we have the @file{PATH} environment variable.
-To understand it better, we will start with a short introduction to the shell
variables.
-Shell variable values are basically treated as strings of characters.
-For example, it does not matter if the value is a name (string of
@emph{alphabetic} characters), or a number (string of @emph{numeric}
characters), or both.
-You can define a variable and a value for it by running
-@example
-$ myvariable1=a_test_value
-$ myvariable2="a test value"
-@end example
-@noindent
-As you see above, if the value contains white space characters, you have to
put the whole value (including white space characters) in double quotes
(@key{"}).
-You can see the value it represents by running
-@example
-$ echo $myvariable1
-$ echo $myvariable2
-@end example
-@noindent
-@cindex Environment
-@cindex Environment variables
-If a variable has no value or it was not defined, the last command will only
print an empty line.
-A variable defined like this will be known as long as this shell or terminal
is running.
-Other terminals will have no idea it existed.
-The main advantage of shell variables is that if they are exported@footnote{By
running @command{$ export myvariable=a_test_value} instead of the simpler case
in the text}, subsequent programs that are run within that shell can access
their value.
-So by changing their value, you can change the ``environment'' of a program
which uses them.
-The shell variables which are accessed by programs are therefore known as
``environment variables''@footnote{You can use shell variables for other
actions too, for example, to temporarily keep some names or run loops on some
files.}.
-You can see the full list of exported variables that your shell recognizes by
running:
-@example
-$ printenv
-@end example
+@menu
+* Release tarball:: Download a stable official release.
+* Version controlled source:: Get and use the version controlled source.
+@end menu
-@cindex @file{HOME}
-@cindex @file{HOME/.local/}
-@cindex Environment variable, @code{HOME}
-@file{HOME} is one commonly used environment variable, it is any user's (the
one that is logged in) top directory.
-Try finding it in the command above.
-It is used so often that the shell has a special expansion (alternative) for
it: `@file{~}'.
-Whenever you see file names starting with the tilde sign, it actually
represents the value to the @file{HOME} environment variable, so @file{~/doc}
is the same as @file{$HOME/doc}.
+@node Release tarball, Version controlled source, Downloading the source,
Downloading the source
+@subsection Release tarball
-@vindex PATH
-@pindex ./configure
-@cindex Setting @code{PATH}
-@cindex Default executable search directory
-@cindex Search directory for executables
-Another one of the most commonly used environment variables is @file{PATH}, it
is a list of directories to search for executable names.
-Its value is a list of directories (separated by a colon, or `@key{:}').
-When the address of the executable is not explicitly given (like
@file{./configure} above), the system will look for the executable in the
directories specified by @file{PATH}.
-If you have a computer nearby, try running the following command to see which
directories your system will look into when it is searching for executable
(binary) files, one example is printed here (notice how @file{/usr/bin}, in the
@file{ls} example above, is one of the directories in @command{PATH}):
+A release tarball (commonly compressed) is the most common way of obtaining
free and open source software.
+A tarball is a snapshot of one particular moment in the Gnuastro development
history along with all the necessary files to configure, build, and install
Gnuastro easily (see @ref{Quick start}).
+It is very straightforward and needs the least set of dependencies (see
@ref{Mandatory dependencies}).
+Gnuastro has tarballs for official stable releases and pre-releases for
testing.
+See @ref{Version numbering} for more on the two types of releases and the
formats of the version numbers.
+The URLs for each type of release are given below.
-@example
-$ echo $PATH
-/usr/local/sbin:/usr/local/bin:/usr/bin
-@end example
+@table @asis
-By default @file{PATH} usually contains system-wide directories, which are
readable (but not writable) by all users, like the above example.
-Therefore if you do not have root (or administrator) access, you need to add
another directory to @file{PATH} which you actually have write access to.
-The standard directory where you can keep installed files (not just
executables) for your own user is the @file{~/.local/} directory.
-The names of hidden files start with a `@key{.}' (dot), so it will not show up
in your common command-line listings, or on the graphical user interface.
-You can use any other directory, but this is the most recognized.
+@item Official stable releases (@url{http://ftp.gnu.org/gnu/gnuastro}):
+This URL hosts the official stable releases of Gnuastro.
+Always use the most recent version (see @ref{Version numbering}).
+By clicking on the ``Last modified'' title of the second column, the files
will be sorted by their date, which you can also use to find the latest version.
+It is recommended to use a mirror to download these tarballs; please visit
@url{http://ftpmirror.gnu.org/gnuastro/} and see below.
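+For example, with GNU Wget on the command-line (replace @file{X.X} with the
latest version number):
+@example
+$ wget http://ftpmirror.gnu.org/gnuastro/gnuastro-X.X.tar.lz
+@end example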
-The top installation directory will be used to keep all the package's
components: programs (executables), libraries, include (header) files, shared
data (like manuals), or configuration files (see @ref{Review of library
fundamentals} for a thorough introduction to headers and linking).
-So it commonly has some of the following sub-directories for each class of
installed components respectively: @file{bin/}, @file{lib/}, @file{include/}
@file{man/}, @file{share/}, @file{etc/}.
-Since the @file{PATH} variable is only used for executables, you can add the
@file{~/.local/bin} directory (which keeps the executables/programs or more
generally, ``binary'' files) to @file{PATH} with the following command.
-As defined below, first the existing value of @file{PATH} is used, then your
given directory is added to its end and the combined value is put back in
@file{PATH} (run `@command{$ echo $PATH}' afterwards to check if it was added).
+@item Pre-release tarballs (@url{http://alpha.gnu.org/gnu/gnuastro}):
+This URL contains unofficial pre-release versions of Gnuastro.
+The pre-release versions of Gnuastro here are for enthusiasts to try out
before an official release.
+If there are problems or bugs, the testers can inform the developers so they
are fixed before the next official release.
+See @ref{Version numbering} to understand how the version numbers here are
formatted.
+If you want to remain even more up-to-date with the developing activities,
please clone the version controlled source as described in @ref{Version
controlled source}.
-@example
-$ PATH=$PATH:~/.local/bin
-@end example
+@end table
-@cindex GNU Bash
-@cindex Startup scripts
-@cindex Scripts, startup
-Any executable that you installed in @file{~/.local/bin} will now be usable
without having to remember and write its full address.
-However, as soon as you leave/close your current terminal session, this
modified @file{PATH} variable will be forgotten.
-Adding the directories which contain executables to the @file{PATH}
environment variable each time you start a terminal is also very inconvenient
and prone to errors.
-Fortunately, there are standard `startup files' defined by your shell
precisely for this (and other) purposes.
-There is a special startup file for every significant starting step:
+@cindex Gzip
+@cindex Lzip
+Gnuastro's official/stable tarball is released in two formats: Gzip (with
suffix @file{.tar.gz}) and Lzip (with suffix @file{.tar.lz}).
+The pre-release tarballs (after version 0.3) are released only as an Lzip
tarball.
+Gzip is a very well-known and widely used compression program created by GNU
and available in most systems.
+However, Lzip provides a better compression ratio and more robust archival
capacity.
+For example, Gnuastro 0.3's tarball was 2.9MB and 4.3MB with Lzip and Gzip
respectively, see the @url{http://www.nongnu.org/lzip/lzip.html, Lzip web page}
for more.
+Lzip might not be pre-installed in your operating system; if so, installing
it from your operating system's package manager or from source is very easy
and fast (it is a very small program).
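+Once Lzip is installed, a recent GNU Tar will detect the compression format
automatically, so unpacking the downloaded tarball is a single command
(replace @file{X.X} with the version you downloaded):
+@example
+$ tar -xf gnuastro-X.X.tar.lz
+@end example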
-@table @asis
+The GNU FTP server is mirrored (has backups) in various locations on the globe
(@url{http://www.gnu.org/order/ftp.html}).
+You can use the closest mirror to your location for a faster download.
+Note that only some mirrors keep track of the pre-release (alpha) tarballs.
+Also note that if you want to download immediately after an announcement (see
@ref{Announcements}), the mirrors might need some time to synchronize with the
main GNU FTP server.
-@cindex GNU Bash
-@item @file{/etc/profile} and everything in @file{/etc/profile.d/}
-These startup scripts are called when your whole system starts (for example,
after you turn on your computer).
-Therefore you need administrator or root privileges to access or modify them.
-@item @file{~/.bash_profile}
-If you are using (GNU) Bash as your shell, the commands in this file are run,
when you log in to your account @emph{through Bash}.
-Most commonly when you login through the virtual console (where there is no
graphic user interface).
+@node Version controlled source, , Release tarball, Downloading the source
+@subsection Version controlled source
-@item @file{~/.bashrc}
-If you are using (GNU) Bash as your shell, the commands here will be run each
time you start a terminal and are already logged in.
-For example, when you open your terminal emulator in the graphic user
interface.
+@cindex Git
+@cindex Version control
+The publicly distributed Gnuastro tarball (for example,
@file{gnuastro-X.X.tar.gz}) does not contain the revision history; it is only a
snapshot of the source code at one significant instant of Gnuastro's history
(specified by the version number, see @ref{Version numbering}), ready to be
configured and built.
+For successful development, the revision history of the code can be very
useful for tracking when something was added or changed; it may also contain
updates that have not yet been officially released.
-@end table
+We use Git for the version control of Gnuastro.
+For those who are not familiar with it, we recommend the
@url{https://git-scm.com/book/en, ProGit book}.
+The whole book is publicly available for online reading and downloading and
does a wonderful job at explaining the concepts and best practices.
-For security reasons, it is highly recommended to directly type in your
@file{HOME} directory value by hand in startup files instead of using variables.
-So in the following, let's assume your user name is `@file{name}' (so @file{~}
may be replaced with @file{/home/name}).
-To add @file{~/.local/bin} to your @file{PATH} automatically on any startup
file, you have to ``export'' the new value of @command{PATH} in the startup
file that is most relevant to you by adding this line:
+Let's assume you want to keep Gnuastro in the @file{TOPGNUASTRO} directory
(can be any directory, change the value below).
+The full version controlled history of Gnuastro can be cloned in
@file{TOPGNUASTRO/gnuastro} by running the following commands@footnote{If your
internet connection is active, but Git complains about the network, it might be
due to your network setup not recognizing the Git protocol.
+In that case use the following URL which uses the HTTP protocol instead:
@command{http://git.sv.gnu.org/r/gnuastro.git}}:
@example
-export PATH=$PATH:/home/name/.local/bin
+$ TOPGNUASTRO=/home/yourname/Research/projects/
+$ cd $TOPGNUASTRO
+$ git clone git://git.sv.gnu.org/gnuastro.git
@end example
-@cindex GNU build system
-@cindex Install directory
-@cindex Directory, install
-Now that you know your system will look into @file{~/.local/bin} for
executables, you can tell Gnuastro's configure script to install everything in
the top @file{~/.local} directory using the @option{--prefix} option.
-When you subsequently run @command{$ make install}, all the install-able files
will be put in their respective directory under @file{~/.local/} (the
executables in @file{~/.local/bin}, the compiled library files in
@file{~/.local/lib}, the library header files in @file{~/.local/include} and so
on, to learn more about these different files, please see @ref{Review of
library fundamentals}).
-Note that tilde (`@key{~}') expansion will not happen if you put a `@key{=}'
between @option{--prefix} and @file{~/.local}@footnote{If you insist on using
`@key{=}', you can use @option{--prefix=$HOME/.local}.}, so we have avoided the
@key{=} character here which is optional in GNU-style options, see
@ref{Options}.
+@noindent
+The @file{$TOPGNUASTRO/gnuastro} directory will contain hand-written (version
controlled) source code for Gnuastro's programs, libraries, this book and the
tests.
+All are divided into sub-directories with standard and very descriptive names.
+The version controlled files in the top cloned directory are either mainly in
capital letters (for example, @file{THANKS} and @file{README}) or mainly
written in lower-case letters (for example, @file{configure.ac} and
@file{Makefile.am}).
+The former are non-programming, standard writing for human readers containing
high-level information about the whole package.
+The latter are instructions to customize the GNU build system for Gnuastro.
+For more on Gnuastro's source code structure, please see @ref{Developing}.
+We will not go any deeper here.
-@example
-$ ./configure --prefix ~/.local
-@end example
+The cloned Gnuastro source cannot immediately be configured, compiled, or
installed since it only contains hand-written files, not automatically
generated or imported files which do all the hard work of the build process.
+See @ref{Bootstrapping} for the process of generating and importing those
files (it is not too hard!).
+Once you have bootstrapped Gnuastro, you can run the standard procedures (in
@ref{Quick start}).
+Very soon after you have cloned it, Gnuastro's main @file{master} branch will
be updated on the main repository (since the developers are actively working on
Gnuastro); for the best practices in keeping your local history in sync with
the main repository, see @ref{Synchronizing}.
-@cindex @file{MANPATH}
-@cindex @file{INFOPATH}
-@cindex @file{LD_LIBRARY_PATH}
-@cindex Library search directory
-@cindex Default library search directory
-You can install everything (including libraries like GSL, CFITSIO, or WCSLIB
which are Gnuastro's mandatory dependencies, see @ref{Mandatory dependencies})
locally by configuring them as above.
-However, recall that @command{PATH} is only for executable files, not
libraries and that libraries can also depend on other libraries.
-For example, WCSLIB depends on CFITSIO and Gnuastro needs both.
-Therefore, when you installed a library in a non-recognized directory, you
have to guide the program that depends on them to look into the necessary
library and header file directories.
-To do that, you have to define the @command{LDFLAGS} and @command{CPPFLAGS}
environment variables respectively.
-This can be done while calling @file{./configure} as shown below:
-@example
-$ ./configure LDFLAGS=-L/home/name/.local/lib \
- CPPFLAGS=-I/home/name/.local/include \
- --prefix ~/.local
-@end example
-It can be annoying/buggy to do this when configuring every software that
depends on such libraries.
-Hence, you can define these two variables in the most relevant startup file
(discussed above).
-The convention on using these variables does not include a colon to separate
values (as @command{PATH}-like variables do).
-They use white space characters and each value is prefixed with a compiler
option@footnote{These variables are ultimately used as options while building
the programs.
-Therefore every value has be an option name followed be a value as discussed
in @ref{Options}.}.
-Note the @option{-L} and @option{-I} above (see @ref{Options}), for
@option{-I} see @ref{Headers}, and for @option{-L}, see @ref{Linking}.
-Therefore we have to keep the value in double quotation signs to keep the
white space characters and adding the following two lines to the startup file
of choice:
-@example
-export LDFLAGS="$LDFLAGS -L/home/name/.local/lib"
-export CPPFLAGS="$CPPFLAGS -I/home/name/.local/include"
-@end example
-@cindex Dynamic libraries
-Dynamic libraries are linked to the executable every time you run a program
that depends on them (see @ref{Linking} to fully understand this important
concept).
-Hence dynamic libraries also require a special path variable called
@command{LD_LIBRARY_PATH} (same formatting as @command{PATH}).
-To use programs that depend on these libraries, you need to add
@file{~/.local/lib} to your @command{LD_LIBRARY_PATH} environment variable by
adding the following line to the relevant start-up file:
+@menu
+* Bootstrapping:: Adding all the automatically generated files.
+* Synchronizing:: Keep your local clone up to date.
+@end menu
+
+@node Bootstrapping, Synchronizing, Version controlled source, Version
controlled source
+@subsubsection Bootstrapping
+
+@cindex Bootstrapping
+@cindex GNU Autoconf Archive
+@cindex Gnulib: GNU Portability Library
+@cindex GNU Portability Library (Gnulib)
+@cindex Automatically created build files
+@noindent
+The version controlled source code lacks the files that we have not written
by hand, i.e., those that are automatically generated or imported.
+These automatically generated files are included in the distributed tarball
for each distribution (for example, @file{gnuastro-X.X.tar.gz}, see
@ref{Version numbering}) and make it easy to immediately configure, build, and
install Gnuastro.
+However, from the perspective of version control, they are just bloatware and
sources of confusion (since they are not changed by Gnuastro developers).
+
+The process of automatically building and importing necessary files into the
cloned directory is known as @emph{bootstrapping}.
+After bootstrapping is done, you are ready to follow the default GNU build
steps that you normally run on the tarball (for example,
@command{./configure && make}, described further in @ref{Quick start}).
+Some known issues may occur during the bootstrapping process; to see how to
fix them, please see @ref{Known issues}.
+
+All the instructions for an automatic bootstrapping are available in
@file{bootstrap} and configured using @file{bootstrap.conf}.
+@file{bootstrap} and @file{COPYING} (which contains the software copyright
notice) are the only files not written by Gnuastro developers that are kept
under version control; this enables simple bootstrapping and provides legal
information on usage immediately after cloning.
+@file{bootstrap} is maintained by the GNU Portability Library (Gnulib) and is
an identical copy of Gnulib's own version, so do not make any changes to it,
since it will be replaced when Gnulib releases an update.
+Make all your changes in @file{bootstrap.conf}.
+
+The bootstrapping process has its own separate set of dependencies; the full
list is given in @ref{Bootstrapping dependencies}.
+They are generally very low-level and used by a very large set of commonly
used programs, so they are probably already installed on your system.
+The simplest way to bootstrap Gnuastro is to run the bootstrap script
within your cloned Gnuastro directory as shown below.
+However, please read the next paragraph before doing so (see @ref{Version
controlled source} for @file{TOPGNUASTRO}).
@example
-export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/name/.local/lib
+$ cd TOPGNUASTRO/gnuastro
+$ ./bootstrap # Requires internet connection
@end example
-If you also want to access the Info (see @ref{Info}) and man pages (see
@ref{Man pages}) documentations add @file{~/.local/share/info} and
@file{~/.local/share/man} to your @command{INFOPATH}@footnote{Info has the
following convention: ``If the value of @command{INFOPATH} ends with a colon
[or it is not defined] ..., the initial list of directories is constructed by
appending the build-time default to the value of @command{INFOPATH}.''
-So when installing in a non-standard directory and if @command{INFOPATH} was
not initially defined, add a colon to the end of @command{INFOPATH} as shown
below.
-Otherwise Info will not be able to find system-wide installed documentation:
-@*@command{echo 'export INFOPATH=$INFOPATH:/home/name/.local/share/info:' >>
~/.bashrc}@*
-Note that this is only an internal convention of Info: do not use it for other
@command{*PATH} variables.} and @command{MANPATH} environment variables
respectively.
+Without any options, @file{bootstrap} will clone Gnulib within your cloned
Gnuastro directory (@file{TOPGNUASTRO/gnuastro/gnulib}) and download the
necessary macros from the GNU Autoconf Archive.
+So if you run bootstrap like this, you will need an internet connection every
time you decide to bootstrap.
+Also, Gnulib is a large package and cloning it can be slow.
+It will also keep the full Gnulib repository within your Gnuastro repository,
so if another one of your projects also needs Gnulib, and you insist on running
bootstrap like this, you will have two copies.
+If you regularly back up your important files, Gnulib will also slow down
the backup process.
+Therefore while the simple invocation above can be used with no problem, it is
not recommended.
+To do better, see the next paragraph.
-@cindex Search directory order
-@cindex Order in search directory
-A final note is that order matters in the directories that are searched for
all the variables discussed above.
-In the examples above, the new directory was added after the system specified
directories.
-So if the program, library or manuals are found in the system wide
directories, the user directory is no longer searched.
-If you want to search your local installation first, put the new directory
before the already existing list, like the example below.
+The recommended way to get these two packages is thoroughly discussed in
@ref{Bootstrapping dependencies} (in short: clone them in the separate
@file{DEVDIR/} directory).
+The following commands will take you into the cloned Gnuastro directory and
run the @file{bootstrap} script, telling it to copy some files instead of
making symbolic links (with the @option{--copy} option, which is not
mandatory@footnote{The @option{--copy} option is recommended because some
backup systems might do strange things with symbolic links.}) and where to look
for Gnulib (with the @option{--gnulib-srcdir} option).
+Please note that the address given to @option{--gnulib-srcdir} has to be an
absolute address (so do not use @file{~} or @file{../} for example).
@example
-export LD_LIBRARY_PATH=/home/name/.local/lib:$LD_LIBRARY_PATH
+$ cd $TOPGNUASTRO/gnuastro
+$ ./bootstrap --copy --gnulib-srcdir=$DEVDIR/gnulib
@end example
-@noindent
-This is good when a library, for example, CFITSIO, is already present on the
system, but the system-wide install was not configured with the correct
configuration flags (see @ref{CFITSIO}), or you want to use a newer version and
you do not have administrator or root access to update it on the whole
system/server.
-If you update @file{LD_LIBRARY_PATH} by placing @file{~/.local/lib} first
(like above), the linker will first find the CFITSIO you installed for yourself
and link with it.
-It thus will never reach the system-wide installation.
-
-There are important security problems with using local installations first:
all important system-wide executables and libraries (important executables like
@command{ls} and @command{cp}, or libraries like the C library) can be replaced
by non-secure versions with the same file names and put in the customized
directory (@file{~/.local} in this example).
-So if you choose to search in your customized directory first, please @emph{be
sure} to keep it clean from executables or libraries with the same names as
important system programs or libraries.
+@cindex GNU Texinfo
+@cindex GNU Libtool
+@cindex GNU Autoconf
+@cindex GNU Automake
+@cindex GNU C library
+@cindex GNU build system
+Since Gnulib and the Autoconf Archive macros are now available in your local
directories, you do not need an internet connection every time you decide to
remove all un-tracked files and redo the bootstrap (see the box below).
+You can also use the same command on any other project that uses Gnulib.
+All the necessary GNU C library functions, Autoconf macros and Automake inputs
are now available along with the book figures.
+The standard GNU build system (@ref{Quick start}) will do the rest of the job.
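+In other words, after a successful bootstrap, the familiar sequence applies
(assuming a default, system-wide installation):
+@example
+$ ./configure
+$ make
+$ make check
+$ sudo make install
+@end example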
@cartouche
@noindent
-@strong{Summary:} When you are using a server which does not give you
administrator/root access AND you would like to give priority to your own built
programs and libraries, not the version that is (possibly already) present on
the server, add these lines to your startup file.
-See above for which startup file is best for your case and for a detailed
explanation on each.
-Do Not forget to replace `@file{/YOUR-HOME-DIR}' with your home directory (for
example, `@file{/home/your-id}'):
+@strong{Undoing the bootstrap:}
+During the development, it might happen that you want to remove all the
automatically generated and imported files.
+In other words, you might want to reverse the bootstrap process.
+Fortunately Git has a good program for this job: @command{git clean}.
+Run the following command and every file that is not version controlled will
be removed.
@example
-export PATH="/YOUR-HOME-DIR/.local/bin:$PATH"
-export LDFLAGS="-L/YOUR-HOME-DIR/.local/lib $LDFLAGS"
-export MANPATH="/YOUR-HOME-DIR/.local/share/man/:$MANPATH"
-export CPPFLAGS="-I/YOUR-HOME-DIR/.local/include $CPPFLAGS"
-export INFOPATH="/YOUR-HOME-DIR/.local/share/info/:$INFOPATH"
-export LD_LIBRARY_PATH="/YOUR-HOME-DIR/.local/lib:$LD_LIBRARY_PATH"
+$ git clean -fxd
@end example
@noindent
-Afterwards, you just need to add an extra
@option{--prefix=/YOUR-HOME-DIR/.local} to the @file{./configure} command of
the software that you intend to install.
-Everything else will be the same as a standard build and install, see
@ref{Quick start}.
+It is best to commit any recent change before running this command.
+You might have created new files since the last commit and if they have not
been committed, they will all be gone forever (as if deleted with
@command{rm}).
+To get a list of the non-version controlled files instead of deleting them,
add the @option{-n} option to @command{git clean}, so it becomes
@option{-fxdn}.
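+For example, the following dry run only lists the files that would be removed:
+@example
+$ git clean -fxdn
+@end example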
@end cartouche
-@node Executable names, Configure and build in RAM, Installation directory,
Configuring
-@subsubsection Executable names
+Besides the @file{bootstrap} and @file{bootstrap.conf}, the
@file{bootstrapped/} directory and @file{README-hacking} file are also related
to the bootstrapping process.
+The former hosts all the imported (bootstrapped) directories.
+Thus, in the version controlled source, it only contains a @file{README} file,
but in the distributed tarball it also contains sub-directories filled with all
bootstrapped files.
+@file{README-hacking} contains a summary of the bootstrapping process
discussed in this section.
+It is a necessary reference when you have not yet built this book; it is thus
not distributed in the Gnuastro tarball.
-@cindex Executable names
-@cindex Names of executables
-At first sight, the names of the executables for each program might seem to be
uncommonly long, for example, @command{astnoisechisel} or @command{astcrop}.
-We could have chosen terse (and cryptic) names like most programs do.
-We chose this complete naming convention (something like the commands in
@TeX{}) so you do not have to spend too much time remembering what the name of
a specific program was.
-Such complete names also enable you to easily search for the programs.
-@cindex Shell auto-complete
-@cindex Auto-complete in the shell
-To facilitate typing the names in, we suggest using the shell auto-complete.
-With this facility you can find the executable you want very easily.
-It is very similar to file name completion in the shell.
-For example, simply by typing the letters below (where @key{[TAB]} stands for
the Tab key on your keyboard)
+@node Synchronizing, , Bootstrapping, Version controlled source
+@subsubsection Synchronizing
+
+The bootstrapping script (see @ref{Bootstrapping}) is not regularly needed:
you mainly need it after you have cloned Gnuastro (once) and whenever you want
to re-import the files from Gnulib or the Autoconf
Archive@footnote{@url{https://savannah.gnu.org/task/index.php?13993} is
defined for you to check if significant (for Gnuastro) updates are made in
these repositories, since the last time you pulled from them.} (not too common).
+However, Gnuastro developers are constantly working on Gnuastro and are
pushing their changes to the official repository.
+Therefore, your local Gnuastro clone will soon be outdated.
+Gnuastro has two mailing lists dedicated to its developing activities (see
@ref{Developing mailing lists}).
+Subscribing to them can help you decide when to synchronize with the official
repository.
+
+To pull all the most recent work in Gnuastro, run the following command from
the top Gnuastro directory.
+If you do not already have a built system, ignore @command{make distclean}.
+The separate steps are described in detail afterwards.
@example
-$ ast[TAB][TAB]
+$ make distclean && git pull && autoreconf -f
@end example
@noindent
-you will get the list of all the available executables that start with
@command{ast} in your @command{PATH} environment variable directories.
-So, all the Gnuastro executables installed on your system will be listed.
-Typing the next letter for the specific program you want along with a Tab,
will limit this list until you get to your desired program.
-
-@cindex Names, customize
-@cindex Customize executable names
-In case all of this does not convince you and you still want to type short
names, some suggestions are given below.
-You should have in mind though, that if you are writing a shell script that
you might want to pass on to others, it is best to use the standard name
because other users might not have adopted the same customization.
-The long names also serve as a form of documentation in such scripts.
-A similar reasoning can be given for option names in scripts: it is good
practice to always use the long formats of the options in shell scripts, see
@ref{Options}.
-
-@cindex Symbolic link
-The simplest solution is making a symbolic link to the actual executable.
-For example, let's assume you want to type @file{ic} to run Crop instead of
@file{astcrop}.
-Assuming you installed Gnuastro executables in @file{/usr/local/bin} (default)
you can do this simply by running the following command as root:
+You can also run the commands separately:
@example
-# ln -s /usr/local/bin/astcrop /usr/local/bin/ic
+$ make distclean
+$ git pull
+$ autoreconf -f
@end example
-@noindent
-In case you update Gnuastro and a new version of Crop is installed, the
-default executable name is the same, so your custom symbolic link still
-works.
+@cindex GNU Autoconf
+@cindex Mailing list: info-gnuastro
+@cindex @code{info-gnuastro@@gnu.org}
+If Gnuastro was already built in this directory, you do not want some outputs
from the previous version being mixed with outputs from the newly pulled work.
+Therefore, the first step is to clean/delete all the built files with
@command{make distclean}.
+Fortunately the GNU build system allows the separation of source and built
files (in separate directories).
+This is a great feature to keep your source directory clean, and you can use
it to avoid the cleaning step above.
+Gnuastro comes with a script with some useful options for this job; it is
especially useful if you regularly pull recent changes (see @ref{Separate
build and source directories}).
-@vindex --program-prefix
-@vindex --program-suffix
-@vindex --program-transform-name
-The installed executable names can also be set using options to @command{$
./configure}, see @ref{Configuring}.
-GNU Autoconf (which configures Gnuastro for your particular system), allows
the builder to change the name of programs with the three options
@option{--program-prefix}, @option{--program-suffix} and
@option{--program-transform-name}.
-The first two are for adding a fixed prefix or suffix to all the programs that
will be installed.
-This will actually make all the names longer! You can use it to add versions
of program names to the programs in order to simultaneously have two executable
versions of a program.
+After the pull, we must re-configure Gnuastro with @command{autoreconf -f}
(part of GNU Autoconf).
+It will update the @file{./configure} script and all the
@file{Makefile.in}@footnote{In the GNU build system, @command{./configure} will
use the @file{Makefile.in} files to create the necessary @file{Makefile} files
that are later read by @command{make} to build the package.} files based on the
hand-written configurations (in @file{configure.ac} and the @file{Makefile.am}
files).
+After running @command{autoreconf -f}, a warning about @code{TEXI2DVI} might
show up; you can ignore it.
-@cindex SED, stream editor
-@cindex Stream editor, SED
-The third configure option allows you to set the executable name at install
time using the SED program.
-SED is a very useful `stream editor'.
-There are various resources on the internet to use it effectively.
-However, we should caution that using configure options will change the actual
executable name of the installed program and on every re-install (an update for
example), you have to also add this option to keep the old executable name
updated.
-Also note that the documentation or configuration files do not change from
their standard names either.
+The most important reason for re-building Gnuastro's build system is to
generate/update the version number for your updated Gnuastro snapshot.
+This generated version number will include the commit information (see
@ref{Version numbering}).
+The version number is included in nearly all outputs of Gnuastro's programs,
therefore it is vital for reproducing an old result.
-@cindex Removing @file{ast} from executables
-For example, let's assume that typing @file{ast} on every invocation of every
program is really annoying you! You can remove this prefix from all the
executables at configure time by adding this option:
+As a summary, be sure to run `@command{autoreconf -f}' after every change in
the Git history.
+This includes synchronization with the main server or even a commit you have
made yourself.
+
+If you would like to see what has changed since you last synchronized your
local clone, you can take the following steps instead of the simple command
above (do not type anything after @code{#}):
@example
-$ ./configure --program-transform-name='s/ast/ /'
+$ git checkout master # Confirm if you are on master.
+$ git fetch origin # Fetch all new commits from server.
+$ git log master..origin/master # See all the new commit messages.
+$ git merge origin/master # Update your master branch.
+$ autoreconf -f # Update the build system.
@end example
+@noindent
+By default @command{git log} prints the most recent commit first, add the
@option{--reverse} option to see the changes chronologically.
+To see exactly what has been changed in the source code along with the commit
message, add a @option{-p} option to the @command{git log}.
+If you want to make changes in the code, have a look at @ref{Developing} to
get started easily.
+Be sure to commit your changes in a separate branch (keep your @code{master}
branch to follow the official repository) and re-run @command{autoreconf -f}
after the commit.
+If you intend to send your work to us, you can safely use your commit since it
will be ultimately recorded in Gnuastro's official history.
+If not, please upload your separate branch to a public hosting service, for
example, @url{https://codeberg.org, Codeberg}, and link to it in your
report/paper.
+Alternatively, run @command{make distcheck} and upload the output
@file{gnuastro-X.X.X.XXXX.tar.gz} to a publicly accessible web page so your
results can be considered scientific (reproducible) later.
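+As a minimal sketch of that workflow (the branch name @code{my-work} is only
an illustration):
+@example
+$ git checkout -b my-work    # Separate branch for your changes.
+$ git commit -a -m "..."     # Commit your edits on that branch.
+$ autoreconf -f              # Re-build the build system.
+$ make distcheck             # Package everything for upload.
+@end example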
-@node Configure and build in RAM, , Executable names, Configuring
-@subsubsection Configure and build in RAM
-@cindex File I/O
-@cindex Input/Output, file
-Gnuastro's configure and build process (the GNU build system) involves the
creation, reading, and modification of a large number of files (input/output,
or I/O).
-Therefore file I/O issues can directly affect the work of developers who need
to configure and build Gnuastro numerous times.
-Some of these issues are listed below:
-@itemize
-@cindex HDD
-@cindex SSD
-@item
-I/O will cause wear and tear on both the HDDs (mechanical failures) and
-SSDs (decreasing the lifetime).
-@cindex Backup
-@item
-Having the built files mixed with the source files can greatly affect backing
up (synchronization) of source files (since it involves the management of a
large number of small files that are regularly changed.
-Backup software can of course be configured to ignore the built files and
directories.
-However, since the built files are mixed with the source files and can have a
large variety, this will require a high level of customization.
-@end itemize
-@cindex tmpfs file system
-@cindex file systems, tmpfs
-One solution to address both these problems is to use the
@url{https://en.wikipedia.org/wiki/Tmpfs, tmpfs file system}.
-Any file in tmpfs is actually stored in the RAM (and possibly SWAP), not on
HDDs or SSDs.
-The RAM is built for extensive and fast I/O.
-Therefore the large number of file I/Os associated with configuring and
building will not harm the HDDs or SSDs.
-Due to the volatile nature of RAM, files in the tmpfs file-system will be
permanently lost after a power-off.
-Since all configured and built files are derivative files (not files that have
been directly written by hand) there is no problem in this and this feature can
be considered as an automatic cleanup.
-@cindex Linux kernel
-@cindex GNU C library
-@cindex GNU build system
-The modern GNU C library (and thus the Linux kernel) defines the
@file{/dev/shm} directory for this purpose in the RAM (POSIX shared memory).
-To build in it, you can use the GNU build system's ability to build in a
separate directory (not necessarily in the source directory) as shown below.
-Just set @file{SRCDIR} as the address of Gnuastro's top source directory (for
example, where there is the unpacked tarball).
-@example
-$ SRCDIR=/home/username/gnuastro
-$ mkdir /dev/shm/tmp-gnuastro-build
-$ cd /dev/shm/tmp-gnuastro-build
-$ $SRCDIR/configure --srcdir=$SRCDIR
-$ make
-@end example
-Gnuastro comes with a script to simplify this process of configuring and
building in a different directory (a ``clean'' build), for more see
@ref{Separate build and source directories}.
-@node Separate build and source directories, Tests, Configuring, Build and
install
-@subsection Separate build and source directories
-The simple steps of @ref{Quick start} will mix the source and built files.
-This can cause inconvenience for developers or enthusiasts following the most
recent work (see @ref{Version controlled source}).
-The current section is mainly focused on this later group of Gnuastro users.
-If you just install Gnuastro on major releases (following
@ref{Announcements}), you can safely ignore this section.
-@cindex GNU build system
-When it is necessary to keep the source (which is under version control), but
not the derivative (built) files (after checking or installing), the best
solution is to keep the source and the built files in separate directories.
-One application of this is already discussed in @ref{Configure and build in
RAM}.
-To facilitate this process of configuring and building in a separate
directory, Gnuastro comes with the @file{developer-build} script.
-It is available in the top source directory and is @emph{not} installed.
-It will make a directory under a given top-level directory (given to
@option{--top-build-dir}) and build Gnuastro there.
-It thus keeps the source completely separated from the built files.
-For easy access to the built files, it also makes a symbolic link to the built
directory in the top source files called @file{build}.
-When running the developer-build script without any options in the Gnuastro's
top source directory, default values will be used for its configuration.
-As with Gnuastro's programs, you can inspect the default values with
@option{-P} (or @option{--printparams}, the output just looks a little
different here).
-The default top-level build directory is @file{/dev/shm}: the shared memory
directory in RAM on GNU/Linux systems as described in @ref{Configure and build
in RAM}.
-
-@cindex Debug
-Besides these, it also has some features to facilitate the job of developers
or bleeding edge users like the @option{--debug} option to do a fast build,
with debug information, no optimization, and no shared libraries.
-Here is the full list of options you can feed to this script to configure its
operations.
+@node Build and install, , Downloading the source, Installation
+@section Build and install
-@cartouche
-@noindent
-@strong{Not all Gnuastro's common program behavior usable here:}
-@file{developer-build} is just a non-installed script with a very limited
scope as described above.
-It thus does not have all the common option behaviors or configuration files
for example.
-@end cartouche
+This section is basically a longer explanation of the sequence of commands given in @ref{Quick start}.
+If you did not have any problems during the @ref{Quick start} steps, you want all of Gnuastro's programs installed on your system, you do not want to change the executable names during or after installation, you have root access to install the programs in the default system-wide directory, and the Letter paper size of the print book is fine for you (in short, if you do not feel like going into the details when everything is working), you can safely skip this section.
-@cartouche
-@noindent
-@strong{White space between option and value:} @file{developer-build}
-does not accept an @key{=} sign between the options and their values.
-It also needs at least one character between the option and its value.
-Therefore @option{-n 4} or @option{--numthreads 4} are acceptable, while
@option{-n4}, @option{-n=4}, or @option{--numthreads=4} are not.
-Finally multiple short option names cannot be merged: for example, you can say
@option{-c -n 4}, but unlike Gnuastro's programs, @option{-cn4} is not
acceptable.
-@end cartouche
+If you have any of the above problems, or want to understand the details for better control over your build and install, read along.
+The dependencies which you will need prior to configuring, building and installing Gnuastro are explained in @ref{Dependencies}.
+The first three steps in @ref{Quick start} need no extra explanation, so we will skip them and start with an explanation of Gnuastro-specific configuration options and a discussion on the installation directory in @ref{Configuring}, followed by some smaller subsections: @ref{Tests}, @ref{A4 print book}, and @ref{Known issues}, which explains solutions to problems you might encounter during the installation steps.
-@cartouche
-@noindent
-@strong{Reusable for other packages:} This script can be used in any software
which is configured and built using the GNU Build System.
-Just copy it in the top source directory of that software and run it from
there.
-@end cartouche
-@cartouche
-@noindent
-@strong{Example usage:} See @ref{Forking tutorial} for an example usage of
this script in some scenarios.
-@end cartouche
+@menu
+* Configuring:: Configure Gnuastro
+* Separate build and source directories:: Keeping derivative/built files separate.
+* Tests:: Run tests to see if it is working.
+* A4 print book:: Customize the print book.
+* Known issues:: Issues you might encounter.
+@end menu
-@table @option
-@item -b STR
-@itemx --top-build-dir STR
-The top build directory to make a directory for the build.
-If this option is not called, the top build directory is @file{/dev/shm} (only
available in GNU/Linux operating systems, see @ref{Configure and build in RAM}).
-@item -V
-@itemx --version
-Print the version string of Gnuastro that will be used in the build.
-This string will be appended to the directory name containing the built files.
-@item -a
-@itemx --autoreconf
-Run @command{autoreconf -f} before building the package.
-In Gnuastro, this is necessary when a new commit has been made to the project
history.
-In Gnuastro's build system, the Git description will be used as the version,
see @ref{Version numbering} and @ref{Synchronizing}.
-@item -c
-@itemx --clean
-@cindex GNU Autoreconf
-Delete the contents of the build directory (clean it) before starting the
configuration and building of this run.
+@node Configuring, Separate build and source directories, Build and install,
Build and install
+@subsection Configuring
-This is useful when you have recently pulled changes from the main Git
repository, or committed a change yourself and ran @command{autoreconf -f}, see
@ref{Synchronizing}.
-After running GNU Autoconf, the version will be updated and you need to do a
clean build.
+@pindex ./configure
+@cindex Configuring
+The @command{$ ./configure} step is the most important step in the build and
install process.
+All the required packages, libraries, headers and environment variables are
checked in this step.
+The behaviors of @command{make} and @command{make install} can also be set through command-line options to this command.
-@item -d
-@itemx --debug
-@cindex Valgrind
-@cindex GNU Debugger (GDB)
-Build with debugging flags (for example, to use in GNU Debugger, also known as
GDB, or Valgrind), disable optimization and also the building of shared
libraries.
-Similar to running the configure script of below
+@cindex Configure options
+@cindex Customizing installation
+@cindex Installation, customizing
+The configure script accepts various arguments and options which enable the
final user to highly customize whatever she is building.
+The options to configure are generally very similar to normal program options
explained in @ref{Arguments and options}.
+As with all GNU programs, you can get a full list of the options, along with a short explanation of each, by running
@example
-$ ./configure --enable-debug
+$ ./configure --help
@end example
-Besides all the debugging advantages of building with this option, it will
also be significantly speed up the build (at the cost of slower built programs).
-So when you are testing something small or working on the build system itself,
it will be much faster to test your work with this option.
-
-@item -v
-@itemx --valgrind
-@cindex Valgrind
-Build all @command{make check} tests within Valgrind.
-For more, see the description of @option{--enable-check-with-valgrind} in
@ref{Gnuastro configure options}.
+@noindent
+@cindex GNU Autoconf
+A complete explanation is also included in the @file{INSTALL} file.
+Note that this file was written by the authors of GNU Autoconf (which builds the @file{configure} script); it is therefore common to all programs that are built and installed with the @command{$ ./configure} script, not just Gnuastro.
+Here we only discuss the cases where you do not have superuser access to the system, or want to change the executable names.
+But before that, we review the configure options that are particular to Gnuastro.
-@item -j INT
-@itemx --jobs INT
-The maximum number of threads/jobs for Make to build at any moment.
-As the name suggests (Make has an identical option), the number given to this
option is directly passed on to any call of Make with its @option{-j} option.
+@menu
+* Gnuastro configure options:: Configure options particular to Gnuastro.
+* Installation directory:: Specify the directory to install.
+* Executable names:: Changing executable names.
+* Configure and build in RAM:: For minimal use of HDD or SSD, and clean
source.
+@end menu
-@item -C
-@itemx --check
-After finishing the build, also run @command{make check}.
-By default, @command{make check} is not run because the developer usually has
their own checks to work on (for example, defined in
@file{tests/during-dev.sh}).
+@node Gnuastro configure options, Installation directory, Configuring,
Configuring
+@subsubsection Gnuastro configure options
-@item -i
-@itemx --install
-After finishing the build, also run @command{make install}.
+@cindex @command{./configure} options
+@cindex Configure options particular to Gnuastro
+Most of the options to configure (which have to do with building) are similar for every program which uses this script.
+Here, the options that are particular to Gnuastro are discussed.
+The next sections explain the usage of other configure options, which can be applied to any program using the GNU build system (through the configure script).
-@item -D
-@itemx --dist
-Run @code{make dist-lzip pdf} to build a distribution tarball (in
@file{.tar.lz} format) and a PDF manual.
-This can be useful for archiving, or sending to colleagues who do not use Git
for an easy build and manual.
+@vtable @option
-@item -u STR
-@item --upload STR
-Activate the @option{--dist} (@option{-D}) option, then use secure copy
(@command{scp}, part of the SSH tools) to copy the tarball and PDF to the
@file{src} and @file{pdf} sub-directories of the specified server and its
directory (value to this option).
-For example, @command{--upload my-server:dir}, will copy the tarball in the
@file{dir/src}, and the PDF manual in @file{dir/pdf} of @code{my-server} server.
-It will then make a symbolic link in the top server directory to the tarball
that is called @file{gnuastro-latest.tar.lz}.
+@item --enable-debug
+@cindex Valgrind
+@cindex Debugging
+@cindex GNU Debugger
+Compile/build Gnuastro with debugging information, no optimization and without
shared libraries.
-@item -p STR
-@itemx --publish=STR
-Clean, bootstrap, build, check and upload the checked tarball and PDF of the
book to the URL given as @code{STR}.
-This option is just a wrapper for @option{--autoreconf --clean --debug --check
--upload STR}.
-@option{--debug} is added because it will greatly speed up the build.
-@option{--debug} will have no effect on the produced tarball (people who later
download will be building with the default optimized, and non-debug mode).
-This option is good when you have made a commit and are ready to publish it on
your server (if nothing crashes).
-Recall that if any of the previous steps fail the script aborts.
+To allow more efficient programs when using Gnuastro (after the installation), by default Gnuastro is built with third-level optimization (a very high level) and no debugging information.
+By default, libraries are also built for static @emph{and} shared linking (see
@ref{Linking}).
+However, when there are crashes or unexpected behavior, these three features
can hinder the process of localizing the problem.
+This configuration option is identical to manually calling the configuration
script with @code{CFLAGS="-g -O0" --disable-shared}.
-@item -I
-@item --install-archive
-Short for @option{--autoreconf --clean --check --install --dist}.
-This is useful when you actually want to install the commit you just made (if
the build and checks succeed).
-It will also produce a distribution tarball and PDF manual for easy access to
the installed tarball on your system at a later time.
+In the (rare) situations where you need to do your debugging on the shared
libraries, do not use this option.
+Instead run the configure script by explicitly setting @code{CFLAGS} like this:
+@example
+$ ./configure CFLAGS="-g -O0"
+@end example
-Ideally, Gnuastro's Git version history makes it easy for a prepared system to
revert back to a different point in history.
-But Gnuastro also needs to bootstrap files and also your collaborators might
(usually do!) find it too much of a burden to do the bootstrapping themselves.
-So it is convenient to have a tarball and PDF manual of the version you have
installed (and are using in your research) handily available.
+@item --enable-check-with-valgrind
+@cindex Valgrind
+Do the @command{make check} tests through Valgrind.
+Therefore, if any crashes or memory-related issues (segmentation faults in
particular) occur in the tests, the output of Valgrind will also be put in the
@file{tests/test-suite.log} file without having to manually modify the check
scripts.
+This option will also activate Gnuastro's debug mode (see the
@option{--enable-debug} configure-time option described above).
-@item -h
-@itemx --help
-@itemx -P
-@itemx --printparams
-Print a description of this script along with all the options and their
-current values.
+Valgrind is free software.
+It is a program for easy checking of memory-related issues in programs.
+It runs a program within its own controlled environment and can thus identify
the exact line-number in the program's source where a memory-related issue
occurs.
+However, it can significantly slow down the tests.
+So this option is only useful when a segmentation fault is found during
@command{make check}.
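+As a minimal sketch of such a debugging session (assuming Valgrind is already installed on your system):
+@example
+$ ./configure --enable-check-with-valgrind
+$ make
+$ make check
+@end example
+Afterwards, inspect @file{tests/test-suite.log} for Valgrind's report on the failing test.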
-@end table
+@item --enable-progname
+Only build and install @file{progname} along with any other program that is
enabled in this fashion.
+@file{progname} is the name of the executable without the @file{ast}, for
example, @file{crop} for Crop (with the executable name of @file{astcrop}).
+Note that by default all the programs will be installed.
+This option (and the @option{--disable-progname} option below) is only relevant when you do not want to install all the programs.
+Therefore, if this option is called for any of the programs in Gnuastro, any
program which is not explicitly enabled will not be built or installed.
+@item --disable-progname
+@itemx --enable-progname=no
+Do not build or install the program named @file{progname}.
+This is very similar to @option{--enable-progname}, but will build and install all the other programs except this one.
-@node Tests, A4 print book, Separate build and source directories, Build and
install
-@subsection Tests
+@cartouche
+@noindent
+@strong{Note:} If some programs are enabled and some are disabled, it is
equivalent to simply enabling those that were enabled.
+Listing the disabled programs is redundant.
+@end cartouche
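+For example, the following hypothetical selection would build and install only Crop and Table; all other programs would be ignored:
+@example
+$ ./configure --enable-crop --enable-table
+@end example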
-@cindex @command{make check}
-@cindex @file{mock.fits}
-@cindex Tests, running
-@cindex Checking tests
-After successfully building (compiling) the programs with the @command{$ make}
command you can check the installation before installing.
-To run the tests, run
+@item --enable-gnulibcheck
+@cindex GNU C library
+@cindex Gnulib: GNU Portability Library
+@cindex GNU Portability Library (Gnulib)
+Enable checks on the GNU Portability Library (Gnulib).
+Gnulib is used by Gnuastro to enable users of non-GNU based operating systems
(that do not use GNU C library or glibc) to compile and use the advanced
features that this library provides.
+We make extensive use of such functions.
+If you give this option to @command{$ ./configure}, when you run @command{$
make check}, first the functions in Gnulib will be tested, then the Gnuastro
executables.
+If your operating system does not support glibc, or has an older version of it, and you have problems in the build process (@command{$ make}), you can give this flag to configure to see if the problem is caused by Gnulib not supporting your operating system, or by Gnuastro itself; see @ref{Known issues}.
+@item --disable-guide-message
+@itemx --enable-guide-message=no
+Do not print a guiding message during the GNU Build process of @ref{Quick
start}.
+By default, after each step, a message is printed guiding the user what the
next command should be.
+Therefore, after @command{./configure}, it will suggest running @command{make}.
+After @command{make}, it will suggest running @command{make check} and so on.
+If Gnuastro is configured with this option, for example
@example
-$ make check
+$ ./configure --disable-guide-message
@end example
+@noindent
+then these messages will not be printed after any step (as is the case with most programs).
+For people who are not yet fully accustomed to this build system, these
guidelines can be very useful and encouraging.
+However, if you find those messages annoying, use this option.
-For every program some tests are designed to check some possible operations.
-Running the command above will run those tests and give you a final report.
-If everything is OK and you have built all the programs, all the tests should
pass.
-In case any of the tests fail, please have a look at @ref{Known issues} and if
that still does not fix your problem, look that the
@file{./tests/test-suite.log} file to see if the source of the error is
something particular to your system or more general.
-If you feel it is general, please contact us because it might be a bug.
-Note that the tests of some programs depend on the outputs of other program's
tests, so if you have not installed them they might be skipped or fail.
-Prior to releasing every distribution all these tests are checked.
-If you have a reasonably modern terminal, the outputs of the successful tests
will be colored green and the failed ones will be colored red.
-
-These scripts can also act as a good set of examples for you to see how the
programs are run.
-All the tests are in the @file{tests/} directory.
-The tests for each program are shell scripts (ending with @file{.sh}) in a
sub-directory of this directory with the same name as the program.
-See @ref{Test scripts} for more detailed information about these scripts in
case you want to inspect them.
+@item --without-libgit2
+@cindex Git
+@pindex libgit2
+@cindex Version control systems
+Build Gnuastro without libgit2 (for including Git commit hashes in output
files), see @ref{Optional dependencies}.
+libgit2 is an optional dependency; with this option, Gnuastro will ignore any libgit2 that may already be on the system.
+@item --without-libjpeg
+@pindex libjpeg
+@cindex JPEG format
+Build Gnuastro without libjpeg (for reading/writing to JPEG files), see
@ref{Optional dependencies}.
+libjpeg is an optional dependency; with this option, Gnuastro will ignore any libjpeg that may already be on the system.
+@item --without-libtiff
+@pindex libtiff
+@cindex TIFF format
+Build Gnuastro without libtiff (for reading/writing to TIFF files), see
@ref{Optional dependencies}.
+libtiff is an optional dependency; with this option, Gnuastro will ignore any libtiff that may already be on the system.
+@item --with-python
+@cindex PyPI
+@cindex Python
+Build the Python interface within Gnuastro's dynamic library.
+This interface can be used for easy communication with Python wrappers (for
example, the pyGnuastro package).
-@node A4 print book, Known issues, Tests, Build and install
-@subsection A4 print book
+When you install the pyGnuastro package from PyPI, a correctly configured Gnuastro library (with the Python interface) is already packaged with it, independent of your Gnuastro installation.
+The Python interface is only necessary if you want to build pyGnuastro from
source (which is only necessary for developers).
+Therefore it has to be explicitly activated at configure time with this option.
+For more on the interface functions, see @ref{Python interface}.
-@cindex A4 print book
-@cindex Modifying print book
-@cindex A4 paper size
-@cindex US letter paper size
-@cindex Paper size, A4
-@cindex Paper size, US letter
-The default print version of this book is provided in the letter paper size.
-If you would like to have the print version of this book on paper and you are
living in a country which uses A4, then you can rebuild the book.
-The great thing about the GNU build system is that the book source code which
is in Texinfo is also distributed with the program source code, enabling you to
do such customization (hacking).
+@end vtable
-@cindex GNU Texinfo
-In order to change the paper size, you will need to have GNU Texinfo installed.
-Open @file{doc/gnuastro.texi} with any text editor.
-This is the source file that created this book.
-In the first few lines you will see this line:
+The tests of some programs might depend on the outputs of the tests of other programs.
+For example, MakeProfiles is one of the first programs to be tested when you run @command{$ make check}.
+MakeProfiles' test outputs (FITS images) are inputs to many other programs (which in turn provide inputs for other programs).
+Therefore, if you do not build MakeProfiles, for example, the tests of many of the other programs will be skipped.
+To avoid this, in one run you can build all the programs and run the tests, but not install anything.
+If everything is working correctly, you can then run configure again with only the programs you want.
+However, this time, do not run the tests; simply install directly after building.
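+As a rough sketch of this two-run workflow (Crop here is just a hypothetical stand-in for the programs you finally want):
+@example
+$ ./configure && make && make check   # Build and test everything.
+$ ./configure --enable-crop           # Re-configure for Crop only.
+$ make && make install                # Build and install, no tests.
+@end example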
+
+
+
+@node Installation directory, Executable names, Gnuastro configure options,
Configuring
+@subsubsection Installation directory
+
+@vindex --prefix
+@cindex Superuser, not possible
+@cindex Root access, not possible
+@cindex No access to superuser install
+@cindex Install with no superuser access
+One of the most commonly used options to @file{./configure} is @option{--prefix}; it is used to define the directory that will host all the installed files (or the ``prefix'' in their final absolute file names).
+This is useful, for example, when you are using a server and do not have administrator or root access.
+In this example scenario, if you do not use the @option{--prefix} option, you will not be able to install the built files (and thus access them from anywhere without having to worry about where they are installed).
+However, once you prepare your startup file to look into the proper place (as discussed thoroughly below), you will easily be able to use this option and benefit from any software you want to install, without having to ask the system administrators; you can even install and use a different version of a software that is already installed on the server.
+
+The most basic way to run an executable is to explicitly write its full file
name (including all the directory information) and run it.
+One example is running the configuration script with the @command{$
./configure} command (see @ref{Quick start}).
+By giving a specific directory (the current directory or @file{./}), we are
explicitly telling the shell to look in the current directory for an executable
file named `@file{configure}'.
+Directly specifying the directory is thus useful for executables in the
current (or nearby) directories.
+However, when the program (an executable file) is to be used a lot, specifying
all those directories will become a significant burden.
+For example, the @file{ls} executable lists the contents in a given directory
and it is (usually) installed in the @file{/usr/bin/} directory by the
operating system maintainers.
+Therefore, if using the full address was the only way to access an executable,
each time you wanted a listing of a directory, you would have to run the
following command (which is very inconvenient, both in writing and in
remembering the various directories).
@example
-@@c@@afourpaper
+$ /usr/bin/ls
@end example
+@cindex Shell variables
+@cindex Environment variables
+To address this problem, we have the @file{PATH} environment variable.
+To understand it better, we will start with a short introduction to shell variables.
+Shell variable values are basically treated as strings of characters.
+For example, it does not matter if the value is a name (string of
@emph{alphabetic} characters), or a number (string of @emph{numeric}
characters), or both.
+You can define a variable and a value for it by running
+@example
+$ myvariable1=a_test_value
+$ myvariable2="a test value"
+@end example
@noindent
-In Texinfo, a line is commented with @code{@@c}.
-Therefore, un-comment this line by deleting the first two characters such that
it changes to:
-
+As you see above, if the value contains white space characters, you have to
put the whole value (including white space characters) in double quotes
(@key{"}).
+You can see the value it represents by running
@example
-@@afourpaper
+$ echo $myvariable1
+$ echo $myvariable2
@end example
-
@noindent
-Save the file and close it.
-You can now run the following command
+@cindex Environment
+@cindex Environment variables
+If a variable has no value or it was not defined, the last command will only
print an empty line.
+A variable defined like this will be known as long as this shell or terminal
is running.
+Other terminals will have no idea it existed.
+The main advantage of shell variables is that if they are exported@footnote{By
running @command{$ export myvariable=a_test_value} instead of the simpler case
in the text}, subsequent programs that are run within that shell can access
their value.
+So by changing their value, you can change the ``environment'' of a program
which uses them.
+The shell variables which are accessed by programs are therefore known as
``environment variables''@footnote{You can use shell variables for other
actions too, for example, to temporarily keep some names or run loops on some
files.}.
+You can see the full list of exported variables that your shell recognizes by
running:
@example
-$ make pdf
+$ printenv
@end example
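+To see the effect of exporting, a minimal demonstration (assuming GNU Bash) is to ask a child shell for the value of an exported and a non-exported variable:
+@example
+$ myvariable3="not exported"
+$ bash -c 'echo $myvariable3'    # Prints an empty line.
+$ export myvariable4="exported"
+$ bash -c 'echo $myvariable4'    # Prints `exported'.
+@end example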
-@noindent
-and the new PDF book will be available in @file{SRCdir/doc/gnuastro.pdf}.
-By changing the @command{pdf} in @command{$ make pdf} to @command{ps} or
@command{dvi} you can have the book in those formats.
-Note that you can do this for any book that is in Texinfo format, they might
not have @code{@@afourpaper} line, so you can add it close to the top of the
Texinfo source file.
+@cindex @file{HOME}
+@cindex @file{HOME/.local/}
+@cindex Environment variable, @code{HOME}
+@file{HOME} is one commonly used environment variable; it is the top directory of the user that is currently logged in.
+Try finding it in the command above.
+It is used so often that the shell has a special expansion (alternative) for it: `@file{~}'.
+Whenever you see file names starting with the tilde sign, it actually represents the value of the @file{HOME} environment variable, so @file{~/doc} is the same as @file{$HOME/doc}.
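+You can confirm this equivalence yourself; both commands below should print the same directory:
+@example
+$ echo $HOME
+$ echo ~
+@end example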
+@vindex PATH
+@pindex ./configure
+@cindex Setting @code{PATH}
+@cindex Default executable search directory
+@cindex Search directory for executables
+Another commonly used environment variable is @file{PATH}; its value is a list of directories (separated by a colon, or `@key{:}') to search for executable names.
+When the address of the executable is not explicitly given (like @file{./configure} above), the system will look for the executable in the directories specified by @file{PATH}.
+If you have a computer nearby, try running the following command to see which directories your system will look into when searching for executable (binary) files; one example output is printed here (notice how @file{/usr/bin}, in the @file{ls} example above, is one of the directories in @command{PATH}):
+@example
+$ echo $PATH
+/usr/local/sbin:/usr/local/bin:/usr/bin
+@end example
+By default @file{PATH} usually contains system-wide directories, which are
readable (but not writable) by all users, like the above example.
+Therefore if you do not have root (or administrator) access, you need to add
another directory to @file{PATH} which you actually have write access to.
+The standard directory where you can keep installed files (not just
executables) for your own user is the @file{~/.local/} directory.
+The names of hidden files start with a `@key{.}' (dot), so this directory will not show up in your common command-line listings, or on the graphical user interface.
+You can use any other directory, but this is the most recognized.
-@node Known issues, , A4 print book, Build and install
-@subsection Known issues
+The top installation directory will be used to keep all the package's
components: programs (executables), libraries, include (header) files, shared
data (like manuals), or configuration files (see @ref{Review of library
fundamentals} for a thorough introduction to headers and linking).
+So it commonly has some of the following sub-directories for each class of installed components respectively: @file{bin/}, @file{lib/}, @file{include/}, @file{man/}, @file{share/}, @file{etc/}.
+Since the @file{PATH} variable is only used for executables, you can add the
@file{~/.local/bin} directory (which keeps the executables/programs or more
generally, ``binary'' files) to @file{PATH} with the following command.
+As defined below, first the existing value of @file{PATH} is used, then your
given directory is added to its end and the combined value is put back in
@file{PATH} (run `@command{$ echo $PATH}' afterwards to check if it was added).
-Depending on your operating system and the version of the compiler you are
using, you might confront some known problems during the configuration
(@command{$ ./configure}), compilation (@command{$ make}) and tests (@command{$
make check}).
-Here, their solutions are discussed.
+@example
+$ PATH=$PATH:~/.local/bin
+@end example
-@itemize
-@cindex Configuration, not finding library
-@cindex Development packages
-@item
-@command{$ ./configure}: @emph{Configure complains about not finding a library
even though you have installed it.}
-The possible solution is based on how you installed the package:
+@cindex GNU Bash
+@cindex Startup scripts
+@cindex Scripts, startup
+Any executable that you installed in @file{~/.local/bin} will now be usable
without having to remember and write its full address.
+However, as soon as you leave/close your current terminal session, this
modified @file{PATH} variable will be forgotten.
+Adding the directories which contain executables to the @file{PATH}
environment variable each time you start a terminal is also very inconvenient
and prone to errors.
+Fortunately, there are standard `startup files' defined by your shell
precisely for this (and other) purposes.
+There is a special startup file for every significant starting step:
-@itemize
-@item
-From your distribution's package manager.
-Most probably this is because your distribution has separated the header files
of a library from the library parts.
-Please also install the `development' packages for those libraries too.
-Just add a @file{-dev} or @file{-devel} to the end of the package name and
re-run the package manager.
-This will not happen if you install the libraries from source.
-When installed from source, the headers are also installed.
+@table @asis
-@item
-@cindex @command{LDFLAGS}
-From source.
-Then your linker is not looking where you installed the library.
-If you followed the instructions in this chapter, all the libraries will be
installed in @file{/usr/local/lib}.
-So you have to tell your linker to look in this directory.
-To do so, configure Gnuastro like this:
+@cindex GNU Bash
+@item @file{/etc/profile} and everything in @file{/etc/profile.d/}
+These startup scripts are called when your whole system starts (for example,
after you turn on your computer).
+Therefore you need administrator or root privileges to access or modify them.
-@example
-$ ./configure LDFLAGS="-L/usr/local/lib"
-@end example
+@item @file{~/.bash_profile}
+If you are using (GNU) Bash as your shell, the commands in this file are run when you log in to your account @emph{through Bash}.
+Most commonly, this is when you log in through the virtual console (where there is no graphical user interface).
-If you want to use the libraries for your other programming projects, then
-export this environment variable in a start-up script similar to the case
-for @file{LD_LIBRARY_PATH} explained below, also see @ref{Installation
-directory}.
-@end itemize
+@item @file{~/.bashrc}
+If you are using (GNU) Bash as your shell, the commands here will be run each
time you start a terminal and are already logged in.
+For example, when you open your terminal emulator in the graphical user interface.
-@item
-@vindex --enable-gnulibcheck
-@cindex Gnulib: GNU Portability Library
-@cindex GNU Portability Library (Gnulib)
-@command{$ make}: @emph{Complains about an unknown function on a non-GNU based
operating system.}
-In this case, please run @command{$ ./configure} with the
@option{--enable-gnulibcheck} option to see if the problem is from the GNU
Portability Library (Gnulib) not supporting your system or if there is a
problem in Gnuastro, see @ref{Gnuastro configure options}.
-If the problem is not in Gnulib and after all its tests you get the same
complaint from @command{make}, then please contact us at
@file{bug-gnuastro@@gnu.org}.
-The cause is probably that a function that we have used is not supported by
your operating system and we did not included it along with the source tarball.
-If the function is available in Gnulib, it can be fixed immediately.
+@end table
-@item
-@cindex @command{CPPFLAGS}
-@command{$ make}: @emph{Cannot find the headers (.h files) of installed
libraries.}
-Your C preprocessor (CPP) is not looking in the right place.
-To fix this, configure Gnuastro with an additional @code{CPPFLAGS} like below
(assuming the library is installed in @file{/usr/local/include}:
+For security reasons, it is highly recommended to type your @file{HOME} directory's value in by hand within startup files, instead of using variables.
+So in the following, let's assume your user name is `@file{name}' (so @file{~}
may be replaced with @file{/home/name}).
+To have @file{~/.local/bin} added to your @file{PATH} automatically at every startup, ``export'' the new value of @command{PATH} in the startup file that is most relevant to you by adding this line:
@example
-$ ./configure CPPFLAGS="-I/usr/local/include"
+export PATH=$PATH:/home/name/.local/bin
@end example
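+For example, assuming GNU Bash and the hypothetical user name above, you can append that line to your @file{~/.bashrc} from the command-line:
+@example
+$ echo 'export PATH=$PATH:/home/name/.local/bin' >> ~/.bashrc
+@end example
+@noindent
+The single quotes keep @command{$PATH} from being expanded by the current shell; it will be expanded each time the startup file is read.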
-If you want to use the libraries for your other programming projects, then
export this environment variable in a start-up script similar to the case for
@file{LD_LIBRARY_PATH} explained below, also see @ref{Installation directory}.
+@cindex GNU build system
+@cindex Install directory
+@cindex Directory, install
+Now that you know your system will look into @file{~/.local/bin} for
executables, you can tell Gnuastro's configure script to install everything in
the top @file{~/.local} directory using the @option{--prefix} option.
+When you subsequently run @command{$ make install}, all the install-able files
will be put in their respective directory under @file{~/.local/} (the
executables in @file{~/.local/bin}, the compiled library files in
@file{~/.local/lib}, the library header files in @file{~/.local/include} and so
on, to learn more about these different files, please see @ref{Review of
library fundamentals}).
+Note that tilde (`@key{~}') expansion will not happen if you put a `@key{=}'
between @option{--prefix} and @file{~/.local}@footnote{If you insist on using
`@key{=}', you can use @option{--prefix=$HOME/.local}.}, so we have avoided the
@key{=} character here which is optional in GNU-style options, see
@ref{Options}.
-@cindex Tests, only one passes
+@example
+$ ./configure --prefix ~/.local
+@end example
+
+@cindex @file{MANPATH}
+@cindex @file{INFOPATH}
@cindex @file{LD_LIBRARY_PATH}
-@item
-@command{$ make check}: @emph{Only the first couple of tests pass, all the
rest fail or get skipped.} It is highly likely that when searching for shared
libraries, your system does not look into the @file{/usr/local/lib} directory
(or wherever you installed Gnuastro or its dependencies).
-To make sure it is added to the list of directories, add the following line to
your @file{~/.bashrc} file and restart your terminal.
-Do Not forget to change @file{/usr/local/lib} if the libraries are installed
in other (non-standard) directories.
+@cindex Library search directory
+@cindex Default library search directory
+You can install everything (including libraries like GSL, CFITSIO, or WCSLIB
which are Gnuastro's mandatory dependencies, see @ref{Mandatory dependencies})
locally by configuring them as above.
+However, recall that @command{PATH} is only for executable files, not libraries, and that libraries can also depend on other libraries.
+For example, WCSLIB depends on CFITSIO, and Gnuastro needs both.
+Therefore, when you have installed a library in a non-standard directory, you have to guide the programs that depend on it to the necessary library and header file directories.
+To do that, you have to define the @command{LDFLAGS} and @command{CPPFLAGS}
environment variables respectively.
+This can be done while calling @file{./configure} as shown below:
@example
-export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"
+$ ./configure LDFLAGS=-L/home/name/.local/lib \
+ CPPFLAGS=-I/home/name/.local/include \
+ --prefix ~/.local
@end example
-You can also add more directories by using a colon `@code{:}' to separate them.
-See @ref{Installation directory} and @ref{Linking} to learn more on the
@code{PATH} variables and dynamic linking respectively.
+It can be annoying (and error-prone) to do this when configuring every software that depends on such libraries.
+Hence, you can define these two variables in the most relevant startup file (discussed above).
+The convention for these variables does not use a colon to separate values (as @command{PATH}-like variables do).
+They use white space characters, and each value is prefixed with a compiler option@footnote{These variables are ultimately used as options while building the programs.
+Therefore every value has to be an option name followed by a value, as discussed in @ref{Options}.}.
+Note the @option{-L} and @option{-I} above (see @ref{Options}); for @option{-I} see @ref{Headers}, and for @option{-L}, see @ref{Linking}.
+Therefore we have to keep the value within double quotation marks (to preserve the white space characters) when adding the following two lines to the startup file of choice:
-@cindex GPL Ghostscript
-@item
-@command{$ make check}: @emph{The tests relying on external programs (for
example, @file{fitstopdf.sh} fail.)} This is probably due to the fact that the
version number of the external programs is too old for the tests we have
preformed.
-Please update the program to a more recent version.
-For example, to create a PDF image, you will need GPL Ghostscript, but older
versions do not work, we have successfully tested it on version 9.15.
-Older versions might cause a failure in the test result.
+@example
+export LDFLAGS="$LDFLAGS -L/home/name/.local/lib"
+export CPPFLAGS="$CPPFLAGS -I/home/name/.local/include"
+@end example
-@item
-@cindex @TeX{}
-@cindex GNU Texinfo
-@command{$ make pdf}: @emph{The PDF book cannot be made.}
-To make a PDF book, you need to have the GNU Texinfo program (like any
program, the more recent the better).
-A working @TeX{} program is also necessary, which you can get from Tex
Live@footnote{@url{https://www.tug.org/texlive/}}.
-
-@item
-@cindex GNU Libtool
-After @code{make check}: do not copy the programs' executables to another (for
example, the installation) directory manually (using @command{cp}, or
@command{mv} for example).
-In the default configuration@footnote{If you configure Gnuastro with the
@option{--disable-shared} option, then the libraries will be statically linked
to the programs and this problem will not exist, see @ref{Linking}.}, the
program binaries need to link with Gnuastro's shared library which is also
built and installed with the programs.
-Therefore, to run successfully before and after installation, linking
modifications need to be made by GNU Libtool at installation time.
-@command{make install} does this internally, but a simple copy might give
linking errors when you run it.
-If you need to copy the executables, you can do so after installation.
-
-@cindex Tests, error in converting images
-@item
-@command{$ make} (when bootstrapping): After you have bootstrapped Gnuastro
from the version-controlled source, you may confront the following (or a
similar) error when converting images (for more on bootstrapping, see
@ref{Bootstrapping}):
+@cindex Dynamic libraries
+Dynamic libraries are linked to the executable every time you run a program
that depends on them (see @ref{Linking} to fully understand this important
concept).
+Hence dynamic libraries also require a special path variable called
@command{LD_LIBRARY_PATH} (same formatting as @command{PATH}).
+To use programs that depend on these libraries, you need to add
@file{~/.local/lib} to your @command{LD_LIBRARY_PATH} environment variable by
adding the following line to the relevant start-up file:
@example
-@code{convert: attempt to perform an operation not allowed by the
-security policy `gs' @ error/delegate.c/ExternalDelegateCommand/378.}
+export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/name/.local/lib
@end example
-This error is a known
issue@footnote{@url{https://wiki.archlinux.org/title/ImageMagick}} with
@code{ImageMagick} security policies in some operating systems.
-In short, @code{imagemagick} uses Ghostscript for PDF, EPS, PS and XPS parsing.
-However, because some security vulnerabilities have been found in
Ghostscript@footnote{@url{https://security.archlinux.org/package/ghostscript}},
by default, ImageMagick may be compiled without Ghostscript library.
-In such cases, if allowed, ImageMagick will fall back to the external
@command{gs} command instead of the library.
-But this may be disabled with the following (or a similar) lines in
@code{/etc/ImageMagick-7/policy.xml} (anything related to PDF, PS, or
Ghostscript).
+If you also want to access the Info (see @ref{Info}) and man page (see @ref{Man pages}) documentation, add @file{~/.local/share/info} and @file{~/.local/share/man} to your @command{INFOPATH}@footnote{Info has the
following convention: ``If the value of @command{INFOPATH} ends with a colon
[or it is not defined] ..., the initial list of directories is constructed by
appending the build-time default to the value of @command{INFOPATH}.''
+So when installing in a non-standard directory and if @command{INFOPATH} was
not initially defined, add a colon to the end of @command{INFOPATH} as shown
below.
+Otherwise Info will not be able to find system-wide installed documentation:
+@*@command{echo 'export INFOPATH=$INFOPATH:/home/name/.local/share/info:' >>
~/.bashrc}@*
+Note that this is only an internal convention of Info: do not use it for other
@command{*PATH} variables.} and @command{MANPATH} environment variables
respectively.
+
+@cindex Search directory order
+@cindex Order in search directory
+A final note: the order of the directories given to all the variables discussed above matters (they are searched in order, first to last).
+In the examples above, the new directory was added after the system-specified directories.
+So if a program, library or manual is found in the system-wide directories, the user directory is never searched.
+If you want your local installation to be searched first, put the new directory before the existing list, like the example below.
@example
-<policy domain="delegate" rights="none" pattern="gs" />
-<policy domain="module" rights="none" pattern="@{PS,PDF,XPS@}" />
+export LD_LIBRARY_PATH=/home/name/.local/lib:$LD_LIBRARY_PATH
@end example
-To fix this problem, simply comment such lines (by placing a @code{<!--}
before each statement/line and @code{-->} at the end of that statement/line).
+@noindent
+This is good when a library, for example, CFITSIO, is already present on the
system, but the system-wide install was not configured with the correct
configuration flags (see @ref{CFITSIO}), or you want to use a newer version and
you do not have administrator or root access to update it on the whole
system/server.
+If you update @file{LD_LIBRARY_PATH} by placing @file{~/.local/lib} first
(like above), the linker will first find the CFITSIO you installed for yourself
and link with it.
+It thus will never reach the system-wide installation.
-@end itemize
+There are important security issues with searching your local installation first: any important system-wide executable or library (executables like @command{ls} and @command{cp}, or libraries like the C library) can be shadowed by a non-secure version with the same file name that is placed in the customized directory (@file{~/.local} in this example).
+So if you choose to search your customized directory first, please @emph{be sure} to keep it clean of executables or libraries with the same names as important system programs or libraries.
+@cartouche
@noindent
-If your problem was not listed above, please file a bug report (@ref{Report a
bug}).
+@strong{Summary:} When you are using a server which does not give you
administrator/root access AND you would like to give priority to your own built
programs and libraries, not the version that is (possibly already) present on
the server, add these lines to your startup file.
+See above for which startup file is best for your case and for a detailed
explanation on each.
+Do not forget to replace `@file{/YOUR-HOME-DIR}' with your home directory (for example, `@file{/home/your-id}'):
+@example
+export PATH="/YOUR-HOME-DIR/.local/bin:$PATH"
+export LDFLAGS="-L/YOUR-HOME-DIR/.local/lib $LDFLAGS"
+export MANPATH="/YOUR-HOME-DIR/.local/share/man/:$MANPATH"
+export CPPFLAGS="-I/YOUR-HOME-DIR/.local/include $CPPFLAGS"
+export INFOPATH="/YOUR-HOME-DIR/.local/share/info/:$INFOPATH"
+export LD_LIBRARY_PATH="/YOUR-HOME-DIR/.local/lib:$LD_LIBRARY_PATH"
+@end example
+@noindent
+Afterwards, you just need to add an extra
@option{--prefix=/YOUR-HOME-DIR/.local} to the @file{./configure} command of
the software that you intend to install.
+Everything else will be the same as a standard build and install, see
@ref{Quick start}.
+@end cartouche
+@node Executable names, Configure and build in RAM, Installation directory,
Configuring
+@subsubsection Executable names
+@cindex Executable names
+@cindex Names of executables
+At first sight, the names of the executables for each program might seem to be
uncommonly long, for example, @command{astnoisechisel} or @command{astcrop}.
+We could have chosen terse (and cryptic) names like most programs do.
+We chose this complete naming convention (something like the commands in
@TeX{}) so you do not have to spend too much time remembering what the name of
a specific program was.
+Such complete names also enable you to easily search for the programs.
+@cindex Shell auto-complete
+@cindex Auto-complete in the shell
+To facilitate typing the names in, we suggest using the shell auto-complete.
+With this facility you can find the executable you want very easily.
+It is very similar to file name completion in the shell.
+For example, simply by typing the letters below (where @key{[TAB]} stands for
the Tab key on your keyboard)
+@example
+$ ast[TAB][TAB]
+@end example
+@noindent
+you will get the list of all the available executables that start with
@command{ast} in your @command{PATH} environment variable directories.
+So, all the Gnuastro executables installed on your system will be listed.
+Typing the next letter of the specific program you want, followed by a Tab, will narrow down this list until you get to your desired program.
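+For example, assuming Crop is installed and no other executable on your system starts with @command{astcr}, the keystrokes below should be enough to complete its full name:
+@example
+$ astcr[TAB]
+@end example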
+@cindex Names, customize
+@cindex Customize executable names
+In case all of this does not convince you and you still want to type short
names, some suggestions are given below.
+Keep in mind, though, that if you are writing a shell script that you might want to pass on to others, it is best to use the standard name, because other users might not have adopted the same customization.
+The long names also serve as a form of documentation in such scripts.
+A similar reasoning can be given for option names in scripts: it is good
practice to always use the long formats of the options in shell scripts, see
@ref{Options}.
+@cindex Symbolic link
+The simplest solution is making a symbolic link to the actual executable.
+For example, let's assume you want to type @file{ic} to run Crop instead of
@file{astcrop}.
+Assuming you installed Gnuastro's executables in @file{/usr/local/bin} (the default), you can do this simply by running the following command as root:
+@example
+# ln -s /usr/local/bin/astcrop /usr/local/bin/ic
+@end example
+@noindent
+In case you update Gnuastro and a new version of Crop is installed, the
+default executable name is the same, so your custom symbolic link still
+works.
+@vindex --program-prefix
+@vindex --program-suffix
+@vindex --program-transform-name
+The installed executable names can also be set using options to @command{$
./configure}, see @ref{Configuring}.
+GNU Autoconf (which configures Gnuastro for your particular system) allows the builder to change the names of programs with three options: @option{--program-prefix}, @option{--program-suffix} and @option{--program-transform-name}.
+The first two are for adding a fixed prefix or suffix to all the programs that
will be installed.
+This will actually make all the names longer! However, you can use it to add version suffixes to the program names, in order to simultaneously have two executable versions of a program installed.
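+For example, with a hypothetical version suffix, the command below would install the executables with names like @command{astcrop-0.23}, leaving any previously installed @command{astcrop} untouched:
+@example
+$ ./configure --program-suffix=-0.23
+@end example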
+@cindex SED, stream editor
+@cindex Stream editor, SED
+The third configure option allows you to set the executable name at install
time using the SED program.
+SED is a very useful `stream editor'.
+There are various resources on the internet to use it effectively.
+However, we should caution that using configure options will change the actual executable name of the installed program; on every re-install (an update, for example), you have to add this option again to keep your custom executable names.
+Also note that the documentation and configuration files keep their standard names.
+@cindex Removing @file{ast} from executables
+For example, let's assume that typing @file{ast} on every invocation of every
program is really annoying you! You can remove this prefix from all the
executables at configure time by adding this option:
+@example
+$ ./configure --program-transform-name='s/ast/ /'
+@end example
+@node Configure and build in RAM, , Executable names, Configuring
+@subsubsection Configure and build in RAM
-@node Common program behavior, Data containers, Installation, Top
-@chapter Common program behavior
+@cindex File I/O
+@cindex Input/Output, file
+Gnuastro's configure and build process (the GNU build system) involves the
creation, reading, and modification of a large number of files (input/output,
or I/O).
+Therefore file I/O issues can directly affect the work of developers who need
to configure and build Gnuastro numerous times.
+Some of these issues are listed below:
-All the programs in Gnuastro share a set of common behavior mainly to do with
user interaction to facilitate their usage and development.
-This includes how to feed input datasets into the programs, how to configure
them, specifying the outputs, numerical data types, treating columns of
information in tables, etc.
-This chapter is devoted to describing this common behavior in all programs.
-Because the behaviors discussed here are common to several programs, they are
not repeated in each program's description.
+@itemize
+@cindex HDD
+@cindex SSD
+@item
+I/O will cause wear and tear on both the HDDs (mechanical failures) and
+SSDs (decreasing the lifetime).
-In @ref{Command-line}, a very general description of running the programs on
the command-line is discussed, like difference between arguments and options,
as well as options that are common/shared between all programs.
-None of Gnuastro's programs keep any internal configuration value (values for
their different operational steps), they read their configuration primarily
from the command-line, then from specific files in directory, user, or
system-wide settings.
-Using these configuration files can greatly help reproducible and robust usage
of Gnuastro, see @ref{Configuration files} for more.
+@cindex Backup
+@item
+Having the built files mixed with the source files can greatly complicate backing up (synchronizing) the source files, since it involves the management of a large number of small files that are regularly changed.
+Backup software can of course be configured to ignore the built files and directories.
+However, since the built files are mixed with the source files and can have a large variety, this will require a high level of customization.
+@end itemize
-It is not possible to always have the different options and configurations of
each program on the top of your head.
-It is very natural to forget the options of a program, their current default
values, or how it should be run and what it did.
-Gnuastro's programs have multiple ways to help you refresh your memory in
multiple levels (just an option name, a short description, or fast access to
the relevant section of the manual.
-See @ref{Getting help} for more for more on benefiting from this very
convenient feature.
+@cindex tmpfs file system
+@cindex file systems, tmpfs
+One solution to address both these problems is to use the
@url{https://en.wikipedia.org/wiki/Tmpfs, tmpfs file system}.
+Any file in tmpfs is actually stored in the RAM (and possibly SWAP), not on
HDDs or SSDs.
+The RAM is built for extensive and fast I/O.
+Therefore the large number of file I/Os associated with configuring and
building will not harm the HDDs or SSDs.
+Due to the volatile nature of RAM, files in the tmpfs file system will be
permanently lost after a power-off.
+Since all configured and built files are derivative files (not files that
have been directly written by hand), this is not a problem: the loss can
even be considered an automatic cleanup.
-Many of the programs use the multi-threaded character of modern CPUs, in
@ref{Multi-threaded operations} we will discuss how you can configure this
behavior, along with some tips on making best use of them.
-In @ref{Numeric data types}, we will review the various types to store numbers
in your datasets: setting the proper type for the usage context@footnote{For
example, if the values in your dataset can only be integers between 0 or 65000,
store them in a unsigned 16-bit type, not 64-bit floating point type (which is
the default in most systems).
-It takes four times less space and is much faster to process.} can greatly
improve the file size and also speed of reading, writing or processing them.
+@cindex Linux kernel
+@cindex GNU C library
+@cindex GNU build system
+The modern GNU C library (and thus the Linux kernel) defines the
@file{/dev/shm} directory for this purpose (POSIX shared memory): files in
it are kept in RAM.
+To build in it, you can use the GNU build system's ability to build in a
separate directory (not necessarily the source directory) as shown below.
+Just set @file{SRCDIR} to the address of Gnuastro's top source directory
(for example, the directory of the unpacked tarball).
-We will then look into the recognized table formats in @ref{Tables} and how
large datasets are broken into tiles, or mesh grid in @ref{Tessellation}.
-Finally, we will take a look at the behavior regarding output files:
@ref{Automatic output} describes how the programs set a default name for their
output when you do not give one explicitly (using @option{--output}).
-When the output is a FITS file, all the programs also store some very useful
information in the header that is discussed in @ref{Output FITS files}.
+@example
+$ SRCDIR=/home/username/gnuastro
+$ mkdir /dev/shm/tmp-gnuastro-build
+$ cd /dev/shm/tmp-gnuastro-build
+$ $SRCDIR/configure --srcdir=$SRCDIR
+$ make
+@end example
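+@noindent
+Before trying this, you may want to confirm that @file{/dev/shm} is indeed
a RAM-based (tmpfs) file system on your machine and has enough free space;
for example, with a standard command like:
+
+@example
+$ df --human-readable /dev/shm
+@end example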
-@menu
-* Command-line:: How to use the command-line.
-* Configuration files:: Values for unspecified variables.
-* Getting help:: Getting more information on the go.
-* Multi-threaded operations:: How threads are managed in Gnuastro.
-* Numeric data types:: Different types and how to specify them.
-* Memory management:: How memory is allocated (in RAM or HDD/SSD).
-* Tables:: Recognized table formats.
-* Tessellation:: Tile the dataset into non-overlapping bins.
-* Automatic output:: About automatic output names.
-* Output FITS files:: Common properties when outputs are FITS.
-* Numeric locale:: Decimal point printed like 0.5 instead of 0,5.
-@end menu
+Gnuastro comes with a script to simplify this process of configuring and
building in a different directory (a ``clean'' build); for more, see
@ref{Separate build and source directories}.
-@node Command-line, Configuration files, Common program behavior, Common
program behavior
-@section Command-line
+@node Separate build and source directories, Tests, Configuring, Build and
install
+@subsection Separate build and source directories
-Gnuastro's programs are customized through the standard Unix-like command-line
environment and GNU style command-line options.
-Both are very common in many Unix-like operating system programs.
-In @ref{Arguments and options} we will start with the difference between
arguments and options and elaborate on the GNU style of options.
-Afterwards, in @ref{Common options}, we will go into the detailed list of all
the options that are common to all the programs in Gnuastro.
+The simple steps of @ref{Quick start} will mix the source and built files.
+This can cause inconvenience for developers or enthusiasts following the
most recent work (see @ref{Version controlled source}).
+The current section is mainly focused on this latter group of Gnuastro
users.
+If you just install Gnuastro on major releases (following
@ref{Announcements}), you can safely ignore this section.
-@menu
-* Arguments and options:: Different ways to specify inputs and
configuration.
-* Common options:: Options that are shared between all programs.
-* Shell TAB completion:: Customized TAB completion in Gnuastro.
-* Standard input:: Using output of another program as input.
-* Shell tips:: Useful tips and tricks for program usage.
-@end menu
+@cindex GNU build system
+When it is necessary to keep the source (which is under version control), but
not the derivative (built) files (after checking or installing), the best
solution is to keep the source and the built files in separate directories.
+One application of this is already discussed in @ref{Configure and build in
RAM}.
-@node Arguments and options, Common options, Command-line, Command-line
-@subsection Arguments and options
+To facilitate this process of configuring and building in a separate
directory, Gnuastro comes with the @file{developer-build} script.
+It is available in the top source directory and is @emph{not} installed.
+It will make a directory under a given top-level directory (given to
@option{--top-build-dir}) and build Gnuastro there.
+It thus keeps the source completely separated from the built files.
+For easy access to the built files, it also makes a symbolic link (called
@file{build}) to the build directory in the top source directory.
-@cindex Shell
-@cindex Options to programs
-@cindex Command-line options
-@cindex Arguments to programs
-@cindex Command-line arguments
-When you type a command on the command-line, it is passed onto the shell (a
generic name for the program that manages the command-line) as a string of
characters.
-As an example, see the ``Invoking ProgramName'' sections in this manual for
some examples of commands with each program, like @ref{Invoking asttable},
@ref{Invoking astfits}, or @ref{Invoking aststatistics}.
+When the @file{developer-build} script is run without any options from
Gnuastro's top source directory, default values are used for its
configuration.
+As with Gnuastro's programs, you can inspect the default values with
@option{-P} (or @option{--printparams}; the output just looks a little
different here).
+The default top-level build directory is @file{/dev/shm}: the shared memory
directory in RAM on GNU/Linux systems, as described in @ref{Configure and
build in RAM}.
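+@noindent
+For example, to inspect the defaults before doing a build:
+
+@example
+$ ./developer-build -P
+@end example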
-The shell then breaks up your string into separate @emph{tokens} or
@emph{words} using any @emph{metacharacters} (like white-space, tab,
@command{|}, @command{>} or @command{;}) that are in the string.
-On the command-line, the first thing you usually enter is the name of the
program you want to run.
-After that, you can specify two types of tokens: @emph{arguments} and
@emph{options}.
-In the GNU-style, arguments are those tokens that are not preceded by any
hyphens (@command{-}, see @ref{Arguments}).
-Here is one example:
+@cindex Debug
+Besides these, it also has some features to facilitate the job of
developers or bleeding-edge users, like the @option{--debug} option for a
fast build with debug information, no optimization, and no shared libraries.
+Here is the full list of options you can feed to this script to configure
its operation.
-@example
-$ astcrop --center=53.162551,-27.789676 -w10/3600 --mode=wcs udf.fits
-@end example
+@cartouche
+@noindent
+@strong{Not all of Gnuastro's common program behavior is usable here:}
+@file{developer-build} is just a non-installed script with a very limited
scope as described above.
+It therefore does not have all the common option behaviors or configuration
files, for example.
+@end cartouche
-In the example above, we are running @ref{Crop} to crop a region of width 10
arc-seconds centered at the given RA and Dec from the input Hubble Ultra-Deep
Field (UDF) FITS image.
-Here, the argument is @file{udf.fits}.
-Arguments are most commonly the input file names containing your data.
-Options start with one or two hyphens, followed by an identifier for the
option (the option's name, for example, @option{--center}, @option{-w},
@option{--mode} in the example above) and its value (anything after the option
name, or the optional @key{=} character).
-Through options you can configure how the program runs (interprets the data
you provided).
+@cartouche
+@noindent
+@strong{White space between option and value:} @file{developer-build}
+does not accept an @key{=} sign between the options and their values.
+It also needs at least one character between the option and its value.
+Therefore @option{-n 4} or @option{--numthreads 4} are acceptable, while
@option{-n4}, @option{-n=4}, or @option{--numthreads=4} are not.
+Finally, multiple short option names cannot be merged: for example, you can
say @option{-c -n 4}, but unlike Gnuastro's programs, @option{-cn4} is not
acceptable.
+@end cartouche
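+@noindent
+For example, both of the calls below (with a hypothetical build path) set a
custom top build directory and let Make use four jobs; note the spaces and
the absence of any @key{=}:
+
+@example
+$ ./developer-build -b /path/to/builds -j 4
+$ ./developer-build --top-build-dir /path/to/builds --jobs 4
+@end example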
-@vindex --help
-@vindex --usage
-@cindex Mandatory arguments
-Arguments can be mandatory or optional and unlike options, they do not have
any identifiers.
-Hence, when there are multiple arguments, their order might also matter
(for example, in @command{cp} which is used for copying one file to another
location).
-The outputs of @option{--usage} and @option{--help} shows which arguments are
optional and which are mandatory, see @ref{--usage}.
+@cartouche
+@noindent
+@strong{Reusable for other packages:} This script can be used in any software
which is configured and built using the GNU Build System.
+Just copy it in the top source directory of that software and run it from
there.
+@end cartouche
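+@noindent
+For example, assuming another GNU-build-system package is unpacked in the
(hypothetical) directory @file{~/other-package}, you could do:
+
+@example
+$ cp developer-build ~/other-package/
+$ cd ~/other-package
+$ ./developer-build
+@end example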
-As their name suggests, @emph{options} can be considered to be optional and
most of the time, you do not have to worry about what order you specify them in.
-When the order does matter, or the option can be invoked multiple times, it is
explicitly mentioned in the ``Invoking ProgramName'' section of each program
(this is a very important aspect of an option).
+@cartouche
+@noindent
+@strong{Example usage:} See @ref{Forking tutorial} for an example usage of
this script in some scenarios.
+@end cartouche
-@cindex Metacharacters on the command-line In case your arguments or option
values contain any of the shell's meta-characters, you have to quote them.
-If there is only one such character, you can use a backslash (@command{\})
before it.
-If there are multiple, it might be easier to simply put your whole argument or
option value inside of double quotes (@command{"}).
-In such cases, everything inside the double quotes will be seen as one token
or word.
+@table @option
-@cindex HDU
-@cindex Header data unit
-For example, let's say you want to specify the header data unit (HDU) of your
FITS file using a complex expression like `@command{3; images(exposure > 100)}'.
-If you simply add these after the @option{--hdu} (@option{-h}) option, the
programs in Gnuastro will read the value to the HDU option as `@command{3}' and
run.
-Then, the shell will attempt to run a separate command
`@command{images(exposure > 100)}' and complain about a syntax error.
-This is because the semicolon (@command{;}) is an `end of command' character
in the shell.
-To solve this problem you can simply put double quotes around the whole string
you want to pass to @option{--hdu} as seen below:
-@example
-$ astcrop --hdu="3; images(exposure > 100)" image.fits
-@end example
+@item -b STR
+@itemx --top-build-dir STR
+The top-level directory under which a directory will be made for this
build.
+If this option is not called, the top build directory is @file{/dev/shm}
(only available in GNU/Linux operating systems, see @ref{Configure and build
in RAM}).
+@item -V
+@itemx --version
+Print the version string of Gnuastro that will be used in the build.
+This string will be appended to the directory name containing the built files.
+@item -a
+@itemx --autoreconf
+Run @command{autoreconf -f} before building the package.
+In Gnuastro, this is necessary when a new commit has been made to the project
history.
+In Gnuastro's build system, the Git description will be used as the version,
see @ref{Version numbering} and @ref{Synchronizing}.
+@item -c
+@itemx --clean
+@cindex GNU Autoreconf
+Delete the contents of the build directory (clean it) before starting the
configuration and building of this run.
+This is useful when you have recently pulled changes from the main Git
repository, or committed a change yourself and ran @command{autoreconf -f}, see
@ref{Synchronizing}.
+After running GNU Autoconf, the version will be updated and you need to do a
clean build.
-@menu
-* Arguments:: For specifying the main input files/operations.
-* Options:: For configuring the behavior of the program.
-@end menu
+@item -d
+@itemx --debug
+@cindex Valgrind
+@cindex GNU Debugger (GDB)
+Build with debugging flags (for example, to use in GNU Debugger, also known as
GDB, or Valgrind), disable optimization and also the building of shared
libraries.
+This is similar to running the configure script as shown below:
-@node Arguments, Options, Arguments and options, Arguments and options
-@subsubsection Arguments
-In Gnuastro, arguments are almost exclusively used as the input data file
names.
-Please consult the first few paragraph of the ``Invoking ProgramName'' section
for each program for a description of what it expects as input, how many
arguments, or input data, it accepts, or in what order.
-Everything particular about how a program treats arguments, is explained under
the ``Invoking ProgramName'' section for that program.
+@example
+$ ./configure --enable-debug
+@end example
-@cindex Filename suffix
-@cindex Suffix (filename)
-@cindex FITS filename suffixes
-Generally, if there is a standard file name suffix for a particular format,
that filename extension is checked to identify their format.
-In astronomy (and thus Gnuastro), FITS is the preferred format for inputs and
outputs, so the focus here and throughout this book is on FITS.
-However, other formats are also accepted in special cases, for example,
@ref{ConvertType} also accepts JPEG or TIFF inputs, and writes JPEG, EPS or PDF
files.
-The recognized suffixes for these formats are listed there.
+Besides all the debugging advantages of building with this option, it will
also significantly speed up the build (at the cost of slower built
programs).
+So when you are testing something small or working on the build system
itself, it will be much faster to test your work with this option.
-The list below shows the recognized suffixes for FITS data files in Gnuastro's
programs.
-However, in some scenarios FITS writers may not append a suffix to the file,
or use a non-recognized suffix (not in the list below).
-Therefore if a FITS file is expected, but it does not have any of these
suffixes, Gnuastro programs will look into the contents of the file and if it
does conform with the FITS standard, the file will be used.
-Just note that checking about 5 characters at the end of a name string is much
more efficient than opening and checking the contents of a file, so it is
generally recommended to have a recognized FITS suffix.
+@item -v
+@itemx --valgrind
+@cindex Valgrind
+Build all @command{make check} tests within Valgrind.
+For more, see the description of @option{--enable-check-with-valgrind} in
@ref{Gnuastro configure options}.
-@itemize
+@item -j INT
+@itemx --jobs INT
+The maximum number of threads/jobs for Make to build at any moment.
+As the name suggests (Make has an identical option), the number given to this
option is directly passed on to any call of Make with its @option{-j} option.
-@item
-@file{.fits}: The standard file name ending of a FITS image.
+@item -C
+@itemx --check
+After finishing the build, also run @command{make check}.
+By default, @command{make check} is not run because the developer usually has
their own checks to work on (for example, defined in
@file{tests/during-dev.sh}).
-@item
-@file{.fit}: Alternative (3 character) FITS suffix.
+@item -i
+@itemx --install
+After finishing the build, also run @command{make install}.
-@item
-@file{.fits.Z}: A FITS image compressed with @command{compress}.
+@item -D
+@itemx --dist
+Run @code{make dist-lzip pdf} to build a distribution tarball (in
@file{.tar.lz} format) and a PDF manual.
+This can be useful for archiving, or for sending to colleagues who do not
use Git, so they can easily build Gnuastro and read the manual.
-@item
-@file{.fits.gz}: A FITS image compressed with GNU zip (gzip).
+@item -u STR
+@itemx --upload STR
+Activate the @option{--dist} (@option{-D}) option, then use secure copy
(@command{scp}, part of the SSH tools) to copy the tarball and PDF to the
@file{src} and @file{pdf} sub-directories of the directory given as the
value to this option on the specified server.
+For example, @option{--upload my-server:dir} will copy the tarball into
@file{dir/src}, and the PDF manual into @file{dir/pdf}, on the
@code{my-server} server.
+It will then make a symbolic link in the top server directory to the
tarball, called @file{gnuastro-latest.tar.lz}.
-@item
-@file{.fits.fz}: A FITS image compressed with @command{fpack}.
+@item -p STR
+@itemx --publish STR
+Clean, bootstrap, build, check and upload the checked tarball and PDF of the
book to the URL given as @code{STR}.
+This option is just a wrapper for @option{--autoreconf --clean --debug --check
--upload STR}.
+@option{--debug} is added because it will greatly speed up the build.
+@option{--debug} will have no effect on the produced tarball (people who later
download will be building with the default optimized, and non-debug mode).
+This option is good for when you have made a commit and are ready to
publish it on your server (if nothing crashes).
+Recall that if any of the previous steps fails, the script aborts.
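+@noindent
+For example, using the same hypothetical server as in the description of
@option{--upload} above:
+
+@example
+$ ./developer-build -p my-server:dir
+@end example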
-@item
-@file{.imh}: IRAF format image file.
+@item -I
+@itemx --install-archive
+Short for @option{--autoreconf --clean --check --install --dist}.
+This is useful when you actually want to install the commit you just made (if
the build and checks succeed).
+It will also produce a distribution tarball and PDF manual for easy access to
the installed tarball on your system at a later time.
-@end itemize
+Ideally, Gnuastro's Git version history makes it easy for a prepared system
to revert back to a different point in history.
+But Gnuastro also needs to be bootstrapped, and your collaborators may (and
usually do!) find it too much of a burden to do the bootstrapping
themselves.
+So it is convenient to have a tarball and PDF manual of the version you
have installed (and are using in your research) readily available.
-Throughout this book and in the command-line outputs, whenever we want to
generalize all such astronomical data formats in a text place-holder, we
will use @file{ASTRdata}; we will assume that the extension is also part of
this name.
-Any file ending with these names is directly passed on to CFITSIO to read.
-Therefore you do not necessarily have to have these files on your computer,
they can also be located on an FTP or HTTP server too, see the CFITSIO manual
for more information.
+@item -h
+@itemx --help
+@itemx -P
+@itemx --printparams
+Print a description of this script along with all the options and their
+current values.
-CFITSIO has its own error reporting techniques, if your input file(s) cannot
be opened, or read, those errors will be printed prior to the final error by
Gnuastro.
+@end table
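+@noindent
+As an example of combining the options above, a clean debug build that also
runs the tests could look like this:
+
+@example
+$ ./developer-build -a -c -d -C
+@end example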
-@node Options, , Arguments, Arguments and options
-@subsubsection Options
+@node Tests, A4 print book, Separate build and source directories, Build and
install
+@subsection Tests
-@cindex GNU style options
-@cindex Options, GNU style
-@cindex Options, short (@option{-}) and long (@option{--})
-Command-line options allow configuring the behavior of a program in all
GNU/Linux applications for each particular execution on a particular input data.
-A single option can be called in two ways: @emph{long} or @emph{short}.
-All options in Gnuastro accept the long format which has two hyphens and
can have many characters (for example, @option{--hdu}).
-Short options only have one hyphen (@key{-}) followed by one character (for
example, @option{-h}).
-You can see some examples in the list of options in @ref{Common options} or
those for each program's ``Invoking ProgramName'' section.
-Both formats are shown for those which support both.
-First the short is shown then the long.
+@cindex @command{make check}
+@cindex @file{mock.fits}
+@cindex Tests, running
+@cindex Checking tests
+After successfully building (compiling) the programs with the
@command{$ make} command, you can check them before installing.
+To run the tests, run:
-Usually, the short options are for when you are writing on the command-line
and want to save keystrokes and time.
-The long options are good for shell scripts, where you are not usually rushing.
-Long options provide a level of documentation, since they are more descriptive
and less cryptic.
-Usually after a few months of not running a program, the short options will be
forgotten and reading your previously written script will not be easy.
+@example
+$ make check
+@end example
-@cindex On/Off options
-@cindex Options, on/off
-Some options need to be given a value if they are called and some do not.
-You can think of the latter type of options as on/off options.
-These two types of options can be distinguished using the output of the
@option{--help} and @option{--usage} options, which are common to all GNU
software, see @ref{Getting help}.
-In Gnuastro we use the following strings to specify when the option needs a
value and what format that value should be in.
-More specific tests will be done in the program and if the values are out of
range (for example, negative when the program only wants a positive value), an
error will be reported.
+For every program, some tests are designed to check some of its possible
operations.
+Running the command above will run those tests and give you a final report.
+If everything is OK and you have built all the programs, all the tests
should pass.
+In case any of the tests fail, please have a look at @ref{Known issues} and
if that still does not fix your problem, look at the
@file{./tests/test-suite.log} file to see if the source of the error is
something particular to your system or more general.
+If you feel it is general, please contact us because it might be a bug.
+Note that the tests of some programs depend on the outputs of other
programs' tests, so if you have not built all the programs, some tests might
be skipped or fail.
+Prior to releasing every distribution, all these tests are checked.
+If you have a reasonably modern terminal, the outputs of the successful
tests will be colored green and the failed ones will be colored red.
-@vtable @option
+These scripts can also act as a good set of examples for you to see how the
programs are run.
+All the tests are in the @file{tests/} directory.
+The tests for each program are shell scripts (ending with @file{.sh}) in a
sub-directory of this directory with the same name as the program.
+See @ref{Test scripts} for more detailed information about these scripts in
case you want to inspect them.
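+@noindent
+When debugging a single program, re-running the full suite can be slow.
+The GNU Automake test harness generally allows running a subset of the
tests through the @code{TESTS} variable; for example (with a hypothetical
test script name, and run from the directory containing the tests'
Makefile):
+
+@example
+$ make check TESTS="crop/section.sh"
+@end example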
-@item INT
-The value is read as an integer.
-@item FLT
-The value is read as a float.
-There are generally two types, depending on the context.
-If they are for fractions, they will have to be less than or equal to unity.
-@item STR
-The value is read as a string of characters.
-For example, column names in a table, or HDU names in a multi-extension FITS
file.
-Other examples include human-readable settings by some programs like the
@option{--domain} option of the Convolve program that can be either
@code{spatial} or @code{frequency} (to specify the type of convolution, see
@ref{Convolve}).
-@item FITS @r{or} FITS/TXT
-The value should be a file (most commonly FITS).
-In many cases, other formats may also be accepted (for example, input tables
can be FITS or plain-text, see @ref{Recognized table formats}).
+@node A4 print book, Known issues, Tests, Build and install
+@subsection A4 print book
-@end vtable
+@cindex A4 print book
+@cindex Modifying print book
+@cindex A4 paper size
+@cindex US letter paper size
+@cindex Paper size, A4
+@cindex Paper size, US letter
+The default print version of this book is provided in the US letter paper
size.
+If you would like to have the print version of this book on paper and you are
living in a country which uses A4, then you can rebuild the book.
+The great thing about the GNU build system is that the book source code which
is in Texinfo is also distributed with the program source code, enabling you to
do such customization (hacking).
+
+@cindex GNU Texinfo
+In order to change the paper size, you will need to have GNU Texinfo installed.
+Open @file{doc/gnuastro.texi} with any text editor.
+This is the source file that created this book.
+In the first few lines you will see this line:
+
+@example
+@@c@@afourpaper
+@end example
@noindent
-@cindex Values to options
-@cindex Option values
-To specify a value in the short format, simply put the value after the option.
-Note that since the short options are only one character long, you do not have
to type anything between the option and its value.
-For the long option you either need white space or an @option{=} sign, for
example, @option{-h2}, @option{-h 2}, @option{--hdu 2} or @option{--hdu=2} are
all equivalent.
+In Texinfo, a line is commented with @code{@@c}.
+Therefore, un-comment this line by deleting the first two characters such that
it changes to:
-The short format of on/off options (those that do not need values) can be
concatenated for example, these two hypothetical sequences of options are
equivalent: @option{-a -b -c4} and @option{-abc4}.
-As an example, consider the following command to run Crop:
@example
-$ astcrop -Dr3 --wwidth 3 catalog.txt --deccol=4 ASTRdata
+@@afourpaper
@end example
+
@noindent
-The @command{$} is the shell prompt, @command{astcrop} is the program name.
-There are two arguments (@command{catalog.txt} and @command{ASTRdata}) and
four options, two of them given in short format (@option{-D}, @option{-r}) and
two in long format (@option{--width} and @option{--deccol}).
-Three of them require a value and one (@option{-D}) is an on/off option.
+Save the file and close it.
+You can now run the following command:
-@vindex --printparams
-@cindex Options, abbreviation
-@cindex Long option abbreviation
-If an abbreviation is unique between all the options of a program, the long
option names can be abbreviated.
-For example, instead of typing @option{--printparams}, typing @option{--print}
or maybe even @option{--pri} will be enough, if there are conflicts, the
program will warn you and show you the alternatives.
-Finally, if you want the argument parser to stop parsing arguments beyond a
certain point, you can use two dashes: @option{--}.
-No text on the command-line beyond these two dashes will be parsed.
+@example
+$ make pdf
+@end example
-@cindex Repeated options
-@cindex Options, repeated
-Gnuastro has two types of options with values, those that only take a single
value are the most common type.
-If these options are repeated or called more than once on the command-line,
the value of the last time it was called will be assigned to it.
-This is very useful when you are testing/experimenting.
-Let's say you want to make a small modification to one option value.
-You can simply type the option with a new value in the end of the command and
see how the script works.
-If you are satisfied with the change, you can remove the original option for
human readability.
-If the change was not satisfactory, you can remove the one you just added and
not worry about forgetting the original value.
-Without this capability, you would have to memorize or save the original
value somewhere else, run the command, and then change the value again; this
is not at all convenient and can potentially cause lots of bugs.
+@noindent
+and the new PDF book will be available in @file{SRCDIR/doc/gnuastro.pdf}.
+By changing the @command{pdf} in @command{$ make pdf} to @command{ps} or
@command{dvi}, you can have the book in those formats.
+Note that you can do this for any book that is in Texinfo format; if it
does not have an @code{@@afourpaper} line, you can add one close to the top
of its Texinfo source file.
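+@noindent
+For example, to check whether a given Texinfo source already has such a
line, a simple search is enough:
+
+@example
+$ grep afourpaper doc/gnuastro.texi
+@end example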
-On the other hand, some options can be called multiple times in one run of a
program and can thus take multiple values (for example, see the
@option{--column} option in @ref{Invoking asttable}.
-In these cases, the order of stored values is the same order that you
specified on the command-line.
-@cindex Configuration files
-@cindex Default option values
-Gnuastro's programs do not keep any internal default values, so some options
are mandatory and if they do not have a value, the program will complain and
abort.
-Most programs have many such options and typing them by hand on every call is
impractical.
-To facilitate the user experience, after parsing the command-line, Gnuastro's
programs read special configuration files to get the necessary values for the
options you have not identified on the command-line.
-These configuration files are fully described in @ref{Configuration files}.
-@cartouche
-@noindent
-@cindex Tilde expansion as option values
-@strong{CAUTION:} In specifying a file address, if you want to use the shell's
tilde expansion (@command{~}) to specify your home directory, leave at least
one space between the option name and your value.
-For example, use @command{-o ~/test}, @command{--output ~/test} or
@command{--output= ~/test}.
-Calling them with @command{-o~/test} or @command{--output=~/test} will disable
shell expansion.
-@end cartouche
-@cartouche
-@noindent
-@strong{CAUTION:} If you forget to specify a value for an option which
requires one, and that option is the last one, Gnuastro will warn you.
-But if it is in the middle of the command, it will take the text of the next
option or argument as the value which can cause undefined behavior.
-@end cartouche
-@cartouche
-@noindent
-@cindex Counting from zero.
-@strong{NOTE:} In some contexts Gnuastro's counting starts from 0 and in
others 1.
-You can assume by default that counting starts from 1, if it starts from 0 for
a special option, it will be explicitly mentioned.
-@end cartouche
-@node Common options, Shell TAB completion, Arguments and options, Command-line
-@subsection Common options
+@node Known issues, , A4 print book, Build and install
+@subsection Known issues
-@cindex Options common to all programs
-@cindex Gnuastro common options
-To facilitate the job of the users and developers, all the programs in
Gnuastro share some basic command-line options for the options that are common
to many of the programs.
-The full list is classified as @ref{Input output options}, @ref{Processing
options}, and @ref{Operating mode options}.
-In some programs, some of the options are irrelevant, but still recognized
(you will not get an unrecognized option error, but the value is not used).
-Unless otherwise mentioned, these options are identical between all programs.
+Depending on your operating system and the version of the compiler you are
using, you might confront some known problems during the configuration
(@command{$ ./configure}), compilation (@command{$ make}) and tests (@command{$
make check}).
+Here, their solutions are discussed.
-@menu
-* Input output options:: Common input/output options.
-* Processing options:: Options for common processing steps.
-* Operating mode options:: Common operating mode options.
-@end menu
+@itemize
+@cindex Configuration, not finding library
+@cindex Development packages
+@item
+@command{$ ./configure}: @emph{Configure complains about not finding a library
even though you have installed it.}
+The possible solution is based on how you installed the package:
-@node Input output options, Processing options, Common options, Common options
-@subsubsection Input/Output options
+@itemize
+@item
+From your distribution's package manager.
+Most probably this is because your distribution has separated the header
files of a library from the library itself.
+Please also install the `development' packages for those libraries.
+Just add a @file{-dev} or @file{-devel} to the end of the package name and
re-run the package manager.
+This will not happen if you install the libraries from source: when
installed from source, the headers are also installed.
-These options are to do with the input and outputs of the various
-programs.
+@item
+@cindex @command{LDFLAGS}
+From source.
+In this case, your linker is not looking where you installed the library.
+If you followed the instructions in this chapter, all the libraries will be
installed in @file{/usr/local/lib}.
+So you have to tell your linker to look in this directory.
+To do so, configure Gnuastro like this:
-@vtable @option
+@example
+$ ./configure LDFLAGS="-L/usr/local/lib"
+@end example
-@cindex Timeout
-@cindex Standard input
-@item --stdintimeout
-Number of micro-seconds to wait for writing/typing in the @emph{first line} of
standard input from the command-line (see @ref{Standard input}).
-This is only relevant for programs that also accept input from the standard
input, @emph{and} you want to manually write/type the contents on the terminal.
-When the standard input is already connected to a pipe (output of another
program), there will not be any waiting (hence no timeout, thus making this
option redundant).
+If you want to use the libraries for your other programming projects, then
+export this environment variable in a start-up script similar to the case
+for @file{LD_LIBRARY_PATH} explained below, also see @ref{Installation
+directory}.
+@end itemize
-If the first line-break (for example, with the @key{ENTER} key) is not
provided before the timeout, the program will abort with an error that no input
was given.
-Note that this time interval is @emph{only} for the first line that you type.
-Once the first line is given, the program will assume that more data will come
and accept rest of your inputs without any time limit.
-You need to specify the ending of the standard input, for example, by pressing
@key{CTRL-D} after a new line.
+@item
+@vindex --enable-gnulibcheck
+@cindex Gnulib: GNU Portability Library
+@cindex GNU Portability Library (Gnulib)
+@command{$ make}: @emph{Complains about an unknown function on a non-GNU based
operating system.}
+In this case, please run @command{$ ./configure} with the
@option{--enable-gnulibcheck} option to see if the problem is from the GNU
Portability Library (Gnulib) not supporting your system or if there is a
problem in Gnuastro, see @ref{Gnuastro configure options}.
+If the problem is not in Gnulib and after all its tests you get the same
complaint from @command{make}, then please contact us at
@file{bug-gnuastro@@gnu.org}.
+The cause is probably that a function we have used is not supported by your
operating system and we did not include it along with the source tarball.
+If the function is available in Gnulib, it can be fixed immediately.
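+@noindent
+In short, the check can be activated like this:
+
+@example
+$ ./configure --enable-gnulibcheck
+@end example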
-Note that any input you write/type into a program on the command-line with
Standard input will be discarded (lost) once the program is finished.
-It is only recoverable manually from your command-line (where you actually
typed) as long as the terminal is open.
-So only use this feature when you are sure that you do not need the dataset
(or have a copy of it somewhere else).
+@item
+@cindex @command{CPPFLAGS}
+@command{$ make}: @emph{Cannot find the headers (.h files) of installed
libraries.}
+Your C preprocessor (CPP) is not looking in the right place.
+To fix this, configure Gnuastro with an additional @code{CPPFLAGS} like
below (assuming the library is installed in @file{/usr/local/include}):
+@example
+$ ./configure CPPFLAGS="-I/usr/local/include"
+@end example
-@cindex HDU
-@cindex Header data unit
-@item -h STR/INT
-@itemx --hdu=STR/INT
-The name or number of the desired Header Data Unit, or HDU, in the FITS image.
-A FITS file can store multiple HDUs or extensions, each with either an image
or a table or nothing at all (only a header).
-Note that counting of the extensions starts from 0(zero), not 1(one).
-Counting from 0 is forced on us by CFITSIO which directly reads the value you
give with this option (see @ref{CFITSIO}).
-When specifying the name, case is not important so @command{IMAGE},
@command{image} or @command{ImAgE} are equivalent.
+If you want to use the libraries for your other programming projects, then
export this environment variable in a start-up script similar to the case for
@file{LD_LIBRARY_PATH} explained below, also see @ref{Installation directory}.
-CFITSIO has many capabilities to help you find the extension you want, far
beyond the simple extension number and name.
-See CFITSIO manual's ``HDU Location Specification'' section for a very
complete explanation with several examples.
-A @code{#} is appended to the string you specify for the HDU@footnote{With the
@code{#} character, CFITSIO will only read the desired HDU into your memory,
not all the existing HDUs in the fits file.} and the result is put in square
brackets and appended to the FITS file name before calling CFITSIO to read the
contents of the HDU for all the programs in Gnuastro.
+@cindex Tests, only one passes
+@cindex @file{LD_LIBRARY_PATH}
+@item
+@command{$ make check}: @emph{Only the first couple of tests pass, all the
rest fail or get skipped.} It is highly likely that when searching for shared
libraries, your system does not look into the @file{/usr/local/lib} directory
(or wherever you installed Gnuastro or its dependencies).
+To make sure it is added to the list of directories, add the following line to
your @file{~/.bashrc} file and restart your terminal.
+Do not forget to change @file{/usr/local/lib} if the libraries are
installed in other (non-standard) directories.
-@cartouche
-@noindent
-@strong{Default HDU is HDU number 1 (counting from 0):} by default, Gnuastro’s
programs assume that their (main/first) input is in HDU number 1 (counting from
zero).
-So if you don’t specify the HDU number, the program will read the input from
this HDU.
-For programs that can take multiple FITS datasets as input (like
@ref{Arithmetic}) this default HDU applies to the first input, you still need
to call @option{--hdu} for the other inputs.
-Generally, all Gnuastro's programs write their outputs in HDU number 1 (HDU 0
is reserved for metadata like the configuration parameters that the program was
run with).
-For more on this, see @ref{Fits}.
-@end cartouche
+@example
+export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"
+@end example
-@item -s STR
-@itemx --searchin=STR
-Where to match/search for columns when the column identifier was not a number,
see @ref{Selecting table columns}.
-The acceptable values are @command{name}, @command{unit}, or @command{comment}.
-This option is only relevant for programs that take table columns as input.
+You can also add more directories by using a colon `@code{:}' to separate them.
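+@noindent
+For example, with a second (hypothetical) location appended:
+
+@example
+export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib:/opt/libs/lib"
+@end example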
+See @ref{Installation directory} and @ref{Linking} to learn more on the
@code{PATH} variables and dynamic linking respectively.
-@item -I
-@itemx --ignorecase
-Ignore case while matching/searching column meta-data (in the field specified
by the @option{--searchin}).
-The FITS standard suggests to treat the column names as case insensitive,
which is strongly recommended here also but is not enforced.
-This option is only relevant for programs that take table columns as input.
+@cindex GPL Ghostscript
+@item
+@command{$ make check}: @emph{The tests relying on external programs (for
example, @file{fitstopdf.sh}) fail.} This is probably because the version of
the external program is too old for the tests we have performed.
+Please update the program to a more recent version.
+For example, to create a PDF image you will need GPL Ghostscript; older
versions do not work, but we have successfully tested version 9.15.
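+@noindent
+You can check the installed Ghostscript version with its standard option:
+
+@example
+$ gs --version
+@end example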
-This option is not relevant to @ref{BuildProgram}, hence in that program the
short option @option{-I} is used for include directories, not to ignore case.
+@item
+@cindex @TeX{}
+@cindex GNU Texinfo
+@command{$ make pdf}: @emph{The PDF book cannot be made.}
+To make a PDF book, you need to have the GNU Texinfo program (like any
program, the more recent the better).
+A working @TeX{} program is also necessary, which you can get from @TeX{}
Live@footnote{@url{https://www.tug.org/texlive/}}.
-@item -o STR
-@itemx --output=STR
-The name of the output file or directory. With this option the automatic
output names explained in @ref{Automatic output} are ignored.
+@item
+@cindex GNU Libtool
+After @code{make check}: do not manually copy the programs' executables to
another directory (for example, the installation directory) with commands
like @command{cp} or @command{mv}.
+In the default configuration@footnote{If you configure Gnuastro with the
@option{--disable-shared} option, then the libraries will be statically linked
to the programs and this problem will not exist, see @ref{Linking}.}, the
program binaries need to link with Gnuastro's shared library which is also
built and installed with the programs.
+Therefore, to run successfully before and after installation, linking
modifications need to be made by GNU Libtool at installation time.
+@command{make install} does this internally, but a simple copy might give
linking errors when you run it.
+If you need to copy the executables, you can do so after installation.
-@item -T STR
-@itemx --type=STR
-The data type of the output depending on the program context.
-This option is not applicable to some programs like @ref{Fits} and will be
ignored by them.
-The different acceptable values to this option are fully described in
@ref{Numeric data types}.
+@cindex Tests, error in converting images
+@item
+@command{$ make} (when bootstrapping): After you have bootstrapped Gnuastro
from the version-controlled source, you may confront the following (or a
similar) error when converting images (for more on bootstrapping, see
@ref{Bootstrapping}):
-@item -D
-@itemx --dontdelete
-By default, if the output file already exists, Gnuastro's programs will
silently delete it and put their own outputs in its place.
-When this option is activated, if the output file already exists, the programs
will not delete it, will warn you, and will abort.
+@example
+@code{convert: attempt to perform an operation not allowed by the
+security policy `gs' @ error/delegate.c/ExternalDelegateCommand/378.}
+@end example
-@item -K
-@itemx --keepinputdir
-In automatic output names, do not remove the directory information of the
input file names.
-As explained in @ref{Automatic output}, if no output name is specified (with
@option{--output}), then the output name will be made in the existing directory
based on your input's file name (ignoring the directory of the input).
-If you call this option, the directory information of the input will be kept
and the automatically generated output name will be in the same directory as
the input (usually with a suffix added).
-Note that this is only relevant if you are running the program in a
different directory than the input data.
+This error is a known
issue@footnote{@url{https://wiki.archlinux.org/title/ImageMagick}} with
@code{ImageMagick} security policies in some operating systems.
+In short, ImageMagick uses Ghostscript for PDF, EPS, PS and XPS parsing.
+However, because some security vulnerabilities have been found in
Ghostscript@footnote{@url{https://security.archlinux.org/package/ghostscript}},
by default, ImageMagick may be compiled without the Ghostscript library.
+In such cases, if allowed, ImageMagick will fall back to the external
@command{gs} command instead of the library.
+But this fallback may be disabled with the following (or similar) lines in
@code{/etc/ImageMagick-7/policy.xml} (anything related to PDF, PS, or
Ghostscript).
-@item -t STR
-@itemx --tableformat=STR
-The output table's type.
-This option is only relevant when the output is a table and its format cannot
be deduced from its filename.
-For example, if a name ending in @file{.fits} was given to @option{--output},
then the program knows you want a FITS table.
-But there are two types of FITS tables: FITS ASCII, and FITS binary.
-Thus, with this option, the program is able to identify which type you want.
-The currently recognized values to this option are:
+@example
+<policy domain="delegate" rights="none" pattern="gs" />
+<policy domain="module" rights="none" pattern="@{PS,PDF,XPS@}" />
+@end example
-@item --wcslinearmatrix=STR
-Select the linear transformation matrix of the output's WCS.
-This option only takes two values: @code{pc} (for the @code{PCi_j} formalism)
and @code{cd} (for @code{CDi_j}).
-For more on the different formalisms, please see Section 8.1 of the FITS
standard@footnote{@url{https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf}},
version 4.0.
+To fix this problem, simply comment such lines (by placing a @code{<!--}
before each statement/line and @code{-->} at the end of that statement/line).
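+@noindent
+For example, the first of the two lines above would then look like this:
+
+@example
+<!--<policy domain="delegate" rights="none" pattern="gs" />-->
+@end example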
-@cindex @code{CDELT}
-In short, in the @code{PCi_j} formalism, we only keep the linear rotation
matrix in these keywords and put the scaling factor (or the pixel scale in
astronomical imaging) in the @code{CDELTi} keywords.
-In the @code{CDi_j} formalism, we blend the scaling into the rotation into a
single matrix and keep that matrix in these FITS keywords.
-By default, Gnuastro uses the @code{PCi_j} formalism, because it greatly helps
in human readability of the raw keywords and is also the default mode of WCSLIB.
-However, in some circumstances it may be necessary to have the keywords in the
CD format; for example, when you need to feed the outputs into other software
that do not follow the full FITS standard and only recognize the @code{CDi_j}
formalism.
+@end itemize
-@table @command
-@item txt
-A plain text table with white-space characters between the columns (see
-@ref{Gnuastro text table format}).
-@item fits-ascii
-A FITS ASCII table (see @ref{Recognized table formats}).
-@item fits-binary
-A FITS binary table (see @ref{Recognized table formats}).
-@end table
+@noindent
+If your problem was not listed above, please file a bug report (@ref{Report a
bug}).
-@end vtable
-@node Processing options, Operating mode options, Input output options, Common
options
-@subsubsection Processing options
-Some processing steps are common to several programs, so they are defined as
common options to all programs.
-Note that this class of common options is thus necessarily less common between
all the programs than those described in @ref{Input output options}, or
@ref{Operating mode options} options.
-Also, if they are irrelevant for a program, these options will not display in
the @option{--help} output of the program.
-@table @option
-@item --minmapsize=INT
-The minimum size (in bytes) to memory-map a processing/internal array as a
file (on the non-volatile HDD/SSD), and not use the system's RAM.
-Before using this option, please read @ref{Memory management}.
-By default processing arrays will only be memory-mapped to a file when the RAM
is full.
-With this option, you can force the memory-mapping, even when there is enough
RAM.
-To ensure this default behavior, the pre-defined value to this option is an
extremely large value (larger than any existing RAM).
-Please note that using a non-volatile file (in the HDD/SDD) instead of RAM can
significantly increase the program's running time, especially on HDDs (where
read/write is slower).
-Also, note that the number of memory-mapped files that your kernel can support
is limited.
-So when this option is necessary, it is best to give it values larger than 1
megabyte (@option{--minmapsize=1000000}).
-You can then decrease it for a specific program's invocation on a large input
after you see memory issues arise (for example, an error, or the program not
aborting and fully consuming your memory).
-If you see randomly named files remaining in this directory when the program
finishes normally, please send us a bug report so we address the problem, see
@ref{Report a bug}.
-@cartouche
-@noindent
-@strong{Limited number of memory-mapped files:} The operating system kernels
usually support a limited number of memory-mapped files.
-Therefore never set @code{--minmapsize} to zero or a small number of bytes (so
too many files are created).
-If the kernel capacity is exceeded, the program will crash.
-@end cartouche
-@item --quietmmap
-Do Not print any message when an array is stored in non-volatile memory
-(HDD/SSD) and not RAM, see the description of @option{--minmapsize} (above)
-for more.
-@item -Z INT[,INT[,...]]
-@itemx --tilesize=[,INT[,...]]
-The size of regular tiles for tessellation, see @ref{Tessellation}.
-For each dimension an integer length (in units of data-elements or pixels) is
necessary.
-If the number of input dimensions is different from the number of values given
to this option, the program will stop with an error.
-Values must be separated by commas (@key{,}) and can also be fractions (for
example, @code{4/2}).
-If they are fractions, the result must be an integer, otherwise an error will
be printed.
-@item -M INT[,INT[,...]]
-@itemx --numchannels=INT[,INT[,...]]
-The number of channels for larger input tessellation, see @ref{Tessellation}.
-The number and types of acceptable values are similar to @option{--tilesize}.
-The only difference is that instead of length, the integers values given to
this option represent the @emph{number} of channels, not their size.
-@item -F FLT
-@itemx --remainderfrac=FLT
-The fraction of remainder size along all dimensions to add to the first tile.
-See @ref{Tessellation} for a complete description.
-This option is only relevant if @option{--tilesize} is not exactly divisible
by the input dataset's size in a dimension.
-If the remainder size is larger than this fraction (compared to
@option{--tilesize}), then the remainder size will be added with one regular
tile size and divided between two tiles at the start and end of the given
dimension.
-@item --workoverch
-Ignore the channel borders for the high-level job of the given application.
-As a result, while the channel borders are respected in defining the small
tiles (such that no tile will cross a channel border), the higher-level program
operation will ignore them, see @ref{Tessellation}.
-@item --checktiles
-Make a FITS file with the same dimensions as the input but each pixel is
replaced with the ID of the tile that it is associated with.
-Note that the tile IDs start from 0.
-See @ref{Tessellation} for more on Tiling an image in Gnuastro.
-@item --oneelempertile
-When showing the tile values (for example, with @option{--checktiles}, or when
the program's output is tessellated) only use one element for each tile.
-This can be useful when only the relative values given to each tile compared
to the rest are important or need to be checked.
-Since the tiles usually have a large number of pixels within them the output
will be much smaller, and so easier to read, write, store, or send.
-Note that when the full input size in any dimension is not exactly divisible
by the given @option{--tilesize} in that dimension, the edge tile(s) will have
different sizes (in units of the input's size), see @option{--remainderfrac}.
-But with this option, all displayed values are going to have the (same) size
of one data-element.
-Hence, in such cases, the image proportions are going to be slightly different
with this option.
-If your input image is not exactly divisible by the tile size and you want one
value per tile for some higher-level processing, all is not lost though.
-You can see how many pixels were within each tile (for example, to weight the
values or discard some for later processing) with Gnuastro's Statistics (see
@ref{Statistics}) as shown below.
-The output FITS file is going to have two extensions, one with the median
calculated on each tile and one with the number of elements that each tile
covers.
-You can then use the @code{where} operator in @ref{Arithmetic} to set the
values of all tiles that do not have the regular area to a blank value.
-@example
-$ aststatistics --median --number --ontile input.fits \
- --oneelempertile --output=o.fits
-$ REGULAR_AREA=1600 # Check second extension of `o.fits'.
-$ astarithmetic o.fits o.fits $REGULAR_AREA ne nan where \
- -h1 -h2
-@end example
-Note that if @file{input.fits} also has blank values, then the median on
-tiles with blank values will also be ignored with the command above (which
-is desirable).
+@node Common program behavior, Data containers, Installation, Top
+@chapter Common program behavior
+All the programs in Gnuastro share a set of common behaviors, mainly to do
with user interaction, to facilitate their usage and development.
+This includes how to feed input datasets into the programs, how to configure
them, specifying the outputs, numerical data types, treating columns of
information in tables, etc.
+This chapter is devoted to describing this common behavior in all programs.
+Because the behaviors discussed here are common to several programs, they are
not repeated in each program's description.
-@item --inteponlyblank
-When values are to be interpolated, only change the values of the blank
-elements, keep the non-blank elements untouched.
+In @ref{Command-line}, a very general description of running the programs
on the command-line is given, like the difference between arguments and
options, as well as options that are common/shared between all programs.
+None of Gnuastro's programs keep any internal configuration values (values
for their different operational steps); they read their configuration
primarily from the command-line, then from specific files with directory,
user, or system-wide settings.
+Using these configuration files can greatly help reproducible and robust
usage of Gnuastro, see @ref{Configuration files} for more.
-@item --interpmetric=STR
-@cindex Radial metric
-@cindex Taxicab metric
-@cindex Manhattan metric
-@cindex Metric: Manhattan, Taxicab, Radial
-The metric to use for finding nearest neighbors.
-Currently it only accepts the Manhattan (or taxicab) metric with
@code{manhattan}, or the radial metric with @code{radial}.
+It is not possible to always have the different options and configurations
of each program at the top of your head.
+It is very natural to forget the options of a program, their current
default values, or how it should be run and what it did.
+Gnuastro's programs have multiple ways to help you refresh your memory at
multiple levels (just an option name, a short description, or fast access to
the relevant section of the manual).
+See @ref{Getting help} for more on benefiting from this very convenient
feature.
-The Manhattan distance between two points is defined with
@mymath{|\Delta{x}|+|\Delta{y}|}.
-Thus the Manhattan metric has the advantage of being fast, but at the expense
of being less accurate.
-The radial distance is the standard definition of distance in a Euclidean
space: @mymath{\sqrt{\Delta{x}^2+\Delta{y}^2}}.
-It is accurate, but the multiplication and square root can slow down the
processing.
+Many of the programs use the multi-threaded character of modern CPUs; in
@ref{Multi-threaded operations} we will discuss how you can configure this
behavior, along with some tips on making the best use of them.
+In @ref{Numeric data types}, we will review the various types to store
numbers in your datasets: setting the proper type for the usage
context@footnote{For example, if the values in your dataset can only be
integers between 0 and 65000, store them in an unsigned 16-bit type, not a
64-bit floating point type (which is the default in most systems).
+It takes four times less space and is much faster to process.} can greatly
improve the file size and also the speed of reading, writing, or processing
them.
-@item --interpnumngb=INT
-The number of nearby non-blank neighbors to use for interpolation.
-@end table
+We will then look into the recognized table formats in @ref{Tables}, and
how large datasets are broken into tiles (or a mesh grid) in
@ref{Tessellation}.
+Finally, we will take a look at the behavior regarding output files:
@ref{Automatic output} describes how the programs set a default name for their
output when you do not give one explicitly (using @option{--output}).
+When the output is a FITS file, all the programs also store some very useful
information in the header that is discussed in @ref{Output FITS files}.
-@node Operating mode options, , Processing options, Common options
-@subsubsection Operating mode options
+@menu
+* Command-line:: How to use the command-line.
+* Configuration files:: Values for unspecified variables.
+* Getting help:: Getting more information on the go.
+* Multi-threaded operations:: How threads are managed in Gnuastro.
+* Numeric data types:: Different types and how to specify them.
+* Memory management:: How memory is allocated (in RAM or HDD/SSD).
+* Tables:: Recognized table formats.
+* Tessellation:: Tile the dataset into non-overlapping bins.
+* Automatic output:: About automatic output names.
+* Output FITS files:: Common properties when outputs are FITS.
+* Numeric locale:: Decimal point printed like 0.5 instead of 0,5.
+@end menu
-Another group of options that are common to all the programs in Gnuastro are
those to do with the general operation of the programs.
-The explanation for those that are not only limited to Gnuastro but are common
to all GNU programs start with (GNU option).
+@node Command-line, Configuration files, Common program behavior, Common
program behavior
+@section Command-line
-@vtable @option
+Gnuastro's programs are customized through the standard Unix-like command-line
environment and GNU style command-line options.
+Both are very common in many Unix-like operating system programs.
+In @ref{Arguments and options} we will start with the difference between
arguments and options and elaborate on the GNU style of options.
+Afterwards, in @ref{Common options}, we will go into the detailed list of all
the options that are common to all the programs in Gnuastro.
-@item --
-(GNU option) Stop parsing the command-line.
-This option can be useful in scripts or when using the shell history.
-Suppose you have a long list of options, and want to see if removing some of
them (to read from configuration files, see @ref{Configuration files}) can give
a better result.
-If the ones you want to remove are the last ones on the command-line, you do
not have to delete them, you can just add @option{--} before them and if you do
not get what you want, you can remove the @option{--} and get the same initial
result.
+@menu
+* Arguments and options:: Different ways to specify inputs and
configuration.
+* Common options:: Options that are shared between all programs.
+* Shell TAB completion:: Customized TAB completion in Gnuastro.
+* Standard input:: Using output of another program as input.
+* Shell tips:: Useful tips and tricks for program usage.
+@end menu
-@item --usage
-(GNU option) Only print the options and arguments and abort.
-This is very useful when you know what the options do, and have just forgotten their long/short identifiers, see @ref{--usage}.
+@node Arguments and options, Common options, Command-line, Command-line
+@subsection Arguments and options
-@item -?
-@itemx --help
-(GNU option) Print all options with an explanation and abort.
-Adding this option will print all the options in their short and long formats,
also displaying which ones need a value if they are called (with an @option{=}
after the long format followed by a string specifying the format, see
@ref{Options}).
-A short explanation is also given for what the option is for.
-The program will quit immediately after the message is printed and will not do
any form of processing, see @ref{--help}.
+@cindex Shell
+@cindex Options to programs
+@cindex Command-line options
+@cindex Arguments to programs
+@cindex Command-line arguments
+When you type a command on the command-line, it is passed onto the shell (a
generic name for the program that manages the command-line) as a string of
characters.
+For example, see the ``Invoking ProgramName'' sections in this manual for sample commands with each program, like @ref{Invoking asttable}, @ref{Invoking astfits}, or @ref{Invoking aststatistics}.
-@item -V
-@itemx --version
-(GNU option) Print a short message, showing the full name, version, copyright
information and program authors and abort.
-On the first line, it will print the official name (not executable name) and
version number of the program.
-Following this is a blank line and a copyright information.
-The program will not run.
+The shell then breaks up your string into separate @emph{tokens} or @emph{words} using any @emph{metacharacters} (like white-space, tab, @command{|}, @command{>} or @command{;}) that are in the string.
+On the command-line, the first thing you usually enter is the name of the
program you want to run.
+After that, you can specify two types of tokens: @emph{arguments} and
@emph{options}.
+In the GNU-style, arguments are those tokens that are not preceded by any
hyphens (@command{-}, see @ref{Arguments}).
+Here is one example:
-@item -q
-@itemx --quiet
-Do not report steps.
-All the programs in Gnuastro that have multiple major steps will report their
steps for you to follow while they are operating.
-If you do not want to see these reports, you can call this option and only
error/warning messages will be printed.
-If the steps are done very fast (depending on the properties of your input)
disabling these reports will also decrease running time.
+@example
+$ astcrop --center=53.162551,-27.789676 -w10/3600 --mode=wcs udf.fits
+@end example
-@item --cite
-Print all necessary information to cite and acknowledge Gnuastro in your
published papers.
-With this option, the programs will print the Bib@TeX{} entry to include in
your paper for Gnuastro in general, and the particular program's paper (if that
program comes with a separate paper).
-It will also print the necessary acknowledgment statement to add in the
respective section of your paper and it will abort.
-For a more complete explanation, please see @ref{Acknowledgments}.
+In the example above, we are running @ref{Crop} to crop a region of width 10
arc-seconds centered at the given RA and Dec from the input Hubble Ultra-Deep
Field (UDF) FITS image.
+Here, the argument is @file{udf.fits}.
+Arguments are most commonly the input file names containing your data.
+Options start with one or two hyphens, followed by an identifier for the
option (the option's name, for example, @option{--center}, @option{-w},
@option{--mode} in the example above) and its value (anything after the option
name, or the optional @key{=} character).
+Through options you can configure how the program runs (interprets the data
you provided).
-Citations and acknowledgments are vital for the continued work on Gnuastro.
-Gnuastro started, and is continued, based on separate research projects.
-So if you find any of the tools offered in Gnuastro to be useful in your
research, please use the output of this command to cite and acknowledge the
program (and Gnuastro) in your research paper.
-Thank you.
+@vindex --help
+@vindex --usage
+@cindex Mandatory arguments
+Arguments can be mandatory or optional and, unlike options, they do not have any identifiers.
+Hence, when there are multiple arguments, their order might also matter (for example, in @command{cp}, which is used for copying one file to another location).
+The outputs of @option{--usage} and @option{--help} show which arguments are optional and which are mandatory, see @ref{--usage}.
-Gnuastro is still new, there is no separate paper only devoted to Gnuastro yet.
-Therefore currently the paper to cite for Gnuastro is the paper for
NoiseChisel which is the first published paper introducing Gnuastro to the
astronomical community.
-Upon reaching a certain point, a paper completely devoted to describing
Gnuastro's many functionalities will be published, see @ref{GNU Astronomy
Utilities 1.0}.
+As their name suggests, @emph{options} can be considered to be optional and
most of the time, you do not have to worry about what order you specify them in.
+When the order does matter, or the option can be invoked multiple times, it is
explicitly mentioned in the ``Invoking ProgramName'' section of each program
(this is a very important aspect of an option).
-@item -P
-@itemx --printparams
-With this option, Gnuastro's programs will read your command-line options and
all the configuration files.
-If there is no problem (like a missing parameter or a value in the wrong
format or range) and immediately before actually running, the programs will
print the full list of option names, values and descriptions, sorted and
grouped by context and abort.
-They will also report the version number, the date they were configured on
your system and the time they were reported.
+@cindex Metacharacters on the command-line
+In case your arguments or option values contain any of the shell's metacharacters, you have to quote them.
+If there is only one such character, you can use a backslash (@command{\}) before it.
+If there are multiple, it might be easier to simply put your whole argument or option value inside of double quotes (@command{"}).
+In such cases, everything inside the double quotes will be seen as one token or word.
-As an example, you can give your full command-line options and even the input
and output file names and finally just add @option{-P} to check if all the
parameters are finely set.
-If everything is OK, you can just run the same command (easily retrieved from
the shell history, with the top arrow key) and simply remove the last two
characters that showed this option.
+@cindex HDU
+@cindex Header data unit
+For example, let's say you want to specify the header data unit (HDU) of your
FITS file using a complex expression like `@command{3; images(exposure > 100)}'.
+If you simply add these after the @option{--hdu} (@option{-h}) option, the programs in Gnuastro will read the value of the HDU option as `@command{3}' and run.
+Then, the shell will attempt to run a separate command
`@command{images(exposure > 100)}' and complain about a syntax error.
+This is because the semicolon (@command{;}) is an `end of command' character
in the shell.
+To solve this problem you can simply put double quotes around the whole string
you want to pass to @option{--hdu} as seen below:
+@example
+$ astcrop --hdu="3; images(exposure > 100)" image.fits
+@end example
-No program will actually start its processing when this option is called.
-The otherwise mandatory arguments for each program (for example, input image
or catalog files) are no longer required when you call this option.
-@item --config=STR
-Parse @option{STR} as a configuration file name, immediately when this option
is confronted (see @ref{Configuration files}).
-The @option{--config} option can be called multiple times in one run of any
Gnuastro program on the command-line or in the configuration files.
-In any case, it will be immediately read (before parsing the rest of the
options on the command-line, or lines in a configuration file).
-If the given file does not exist or cannot be read for any reason, the program
will print a warning and continue its processing.
-The warning can be suppressed with @option{--quiet}.
-Note that by definition, options on the command-line still take precedence
over those in any configuration file, including the file(s) given to this
option if they are called before it.
-Also see @option{--lastconfig} and @option{--onlyversion} on how this option
can be used for reproducible results.
-You can use @option{--checkconfig} (below) to check/confirm the parsing of
configuration files.
-@item --checkconfig
-Print options and their values, within the command-line or configuration
files, as they are parsed (see @ref{Configuration file precedence}).
-If an option has already been set, or is ignored by the program, this option
will also inform you with special values like @code{--ALREADY-SET--}.
-Only options that are parsed after this option are printed, so to see the
parsing of all input options, it is recommended to put this option immediately
after the program name before any other options.
-@cindex Debug
-This is a very good option to confirm where the value of each option has been defined in scenarios where there are multiple configuration files (for debugging).
+@menu
+* Arguments:: For specifying the main input files/operations.
+* Options:: For configuring the behavior of the program.
+@end menu
-@item --config-prefix=STR
-Accept option names in configuration files that start with the given prefix.
-Since order matters when reading custom configuration files, this option
should be called @strong{before} the @option{--config} option(s) that contain
options with the given prefix.
-This option does not affect the options within configuration files that have
the standard name (without a prefix).
+@node Arguments, Options, Arguments and options, Arguments and options
+@subsubsection Arguments
+In Gnuastro, arguments are almost exclusively used as the input data file names.
+Please consult the first few paragraphs of the ``Invoking ProgramName'' section for each program for a description of what it expects as input, how many arguments or input datasets it accepts, and in what order.
+Everything particular about how a program treats arguments is explained under the ``Invoking ProgramName'' section for that program.
-This gives unique features to Gnuastro's configuration files, especially in
large pipelines.
-Let's demonstrate this with the simple scenario below.
-You have multiple configuration files for different instances of one program
(let's assume @file{nc-a.conf} and @file{nc-b.conf}).
-At the same time, want to load all the option names/values into your shell as
environment variables (for example with @code{source}).
-This happens when you want to use the options as shell variables in other
parts of the your pipeline.
+@cindex Filename suffix
+@cindex Suffix (filename)
+@cindex FITS filename suffixes
+Generally, if there is a standard file name suffix for a particular format, that suffix is checked to identify the file's format.
+In astronomy (and thus Gnuastro), FITS is the preferred format for inputs and
outputs, so the focus here and throughout this book is on FITS.
+However, other formats are also accepted in special cases, for example,
@ref{ConvertType} also accepts JPEG or TIFF inputs, and writes JPEG, EPS or PDF
files.
+The recognized suffixes for these formats are listed there.
-If the two configuration files have different values for the same option (as
shown below), and you don't use @code{--config-prefix}, the shell will
over-write the common option values between the configuration files.
-But thanks to @code{--config-prefix}, you can give a different prefix for the
different instances of the same option in different configuration files.
+The list below shows the recognized suffixes for FITS data files in Gnuastro's
programs.
+However, in some scenarios FITS writers may not append a suffix to the file,
or use a non-recognized suffix (not in the list below).
+Therefore if a FITS file is expected, but it does not have any of these
suffixes, Gnuastro programs will look into the contents of the file and if it
does conform with the FITS standard, the file will be used.
+Just note that checking about 5 characters at the end of a name string is much
more efficient than opening and checking the contents of a file, so it is
generally recommended to have a recognized FITS suffix.
-@example
-$ cat nc-a.conf
-a_tilesize=20,20
+@itemize
-$ cat nc-b.conf
-b_tilesize=40,40
+@item
+@file{.fits}: The standard file name ending of a FITS image.
-## Load configuration files as shell scripts (to define the
-## option name and values as shell variables with values).
-## Just note that 'source' only takes one file at a time.
-$ for c in nc-*.conf; do source $c; done
+@item
+@file{.fit}: Alternative (3 character) FITS suffix.
-$ astnoisechisel img.fits \
- --config=nc-a.conf --config-prefix=a_
-$ echo "NoiseChisel run with --tilesize=$a_tilesize"
+@item
+@file{.fits.Z}: A FITS image compressed with @command{compress}.
-$ astnoisechisel img.fits \
- --config=nc-b.conf --config-prefix=b_
-$ echo "NoiseChisel run with --tilesize=$b_tilesize"
-@end example
+@item
+@file{.fits.gz}: A FITS image compressed with GNU zip (gzip).
-@item -S
-@itemx --setdirconf
-Update the current directory configuration file for the Gnuastro program and
quit.
-The full set of command-line and configuration file options will be parsed and
options with a value will be written in the current directory configuration
file for this program (see @ref{Configuration files}).
-If the configuration file or its directory does not exist, it will be created.
-If a configuration file exists it will be replaced (after it, and all other
configuration files have been read).
-In any case, the program will not run.
+@item
+@file{.fits.fz}: A FITS image compressed with @command{fpack}.
-This is the recommended method@footnote{Alternatively, you can use your
favorite text editor.} to edit/set the configuration file for all future calls
to Gnuastro's programs.
-It will internally check if your values are in the correct range and type and
save them according to the configuration file format, see @ref{Configuration
file format}.
-So if there are unreasonable values to some options, the program will notify
you and abort before writing the final configuration file.
+@item
+@file{.imh}: IRAF format image file.
-When this option is called, the otherwise mandatory arguments, for
-example input image or catalog file(s), are no longer mandatory (since
-the program will not run).
+@end itemize
-@item -U
-@itemx --setusrconf
-Update the user configuration file and quit (see @ref{Configuration files}).
-See explanation under @option{--setdirconf} for more details.
+Throughout this book and in the command-line outputs, whenever we want to generalize all such astronomical data formats with a text place-holder, we will use @file{ASTRdata}; the suffix is assumed to be part of this name.
+Any file ending with these names is directly passed on to CFITSIO to read.
+Therefore you do not necessarily have to have these files on your computer: they can also be located on an FTP or HTTP server; see the CFITSIO manual for more information.
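+For example, with the (hypothetical) URL below, and assuming your CFITSIO was built with network support, the file is fetched over the network and read as if it were local:
+
+@example
+$ astfits https://example.org/archive/image.fits
+@end example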
-@item --lastconfig
-This is the last configuration file that must be read.
-When this option is confronted in any stage of reading the options (on the
command-line or in a configuration file), no other configuration file will be
parsed, see @ref{Configuration file precedence} and @ref{Current directory and
User wide}.
-Like all on/off options, on the command-line, this option does not take any
values.
-But in a configuration file, it takes the values of @option{0} or @option{1},
see @ref{Configuration file format}.
-If it is present in a configuration file with a value of @option{0}, then all
later occurrences of this option will be ignored.
+CFITSIO has its own error reporting techniques: if your input file(s) cannot be opened or read, those errors will be printed prior to the final error by Gnuastro.
-@item --onlyversion=STR
-Only run the program if Gnuastro's version is exactly equal to @option{STR}
(see @ref{Version numbering}).
-Note that it is not compared as a number, but as a string of characters, so
@option{0}, or @option{0.0} and @option{0.00} are different.
-If the running Gnuastro version is different, then this option will report an
error and abort as soon as it is confronted on the command-line or in a
configuration file.
-If the running Gnuastro version is the same as @option{STR}, then the program
will run as if this option was not called.
-This is useful if you want your results to be exactly reproducible and not
mistakenly run with an updated/newer or older version of the program.
-Besides internal algorithmic/behavior changes in programs, the existence of
options or their names might change between versions (especially in these
earlier versions of Gnuastro).
+@node Options, , Arguments, Arguments and options
+@subsubsection Options
-Hence, when using this option (probably in a script or in a configuration
file), be sure to call it before other options.
-The benefit is that, when the version differs, the other options will not be
parsed and you, or your collaborators/users, will not get errors saying an
option in your configuration does not exist in the running version of the
program.
+@cindex GNU style options
+@cindex Options, GNU style
+@cindex Options, short (@option{-}) and long (@option{--})
+Command-line options allow configuring the behavior of a program in all GNU/Linux applications for each particular execution on particular input data.
+A single option can be called in two ways: @emph{long} or @emph{short}.
+All options in Gnuastro accept the long format, which has two hyphens and can have many characters (for example, @option{--hdu}).
+Short options only have one hyphen (@key{-}) followed by one character (for example, @option{-h}).
+You can see some examples in the list of options in @ref{Common options} or those for each program's ``Invoking ProgramName'' section.
+Both formats are shown for those which support both; first the short is shown, then the long.
-Here is one example of how this option can be used in conjunction with the
@option{--lastconfig} option.
-Let's assume that you were satisfied with the results of this command:
@command{astnoisechisel image.fits --snquant=0.95} (along with various options
set in various configuration files).
-You can save the state of NoiseChisel and reproduce that exact result on
@file{image.fits} later by following these steps (the extra spaces, and
@key{\}, are only for easy readability, if you want to try it out, only one
space between each token is enough).
+Usually, the short options are for when you are writing on the command-line
and want to save keystrokes and time.
+The long options are good for shell scripts, where you are not usually rushing.
+Long options provide a level of documentation, since they are more descriptive
and less cryptic.
+Usually after a few months of not running a program, the short options will be
forgotten and reading your previously written script will not be easy.
-@example
-$ echo "onlyversion X.XX" > reproducible.conf
-$ echo "lastconfig 1" >> reproducible.conf
-$ astnoisechisel image.fits --snquant=0.95 -P \
- >> reproducible.conf
-@end example
+@cindex On/Off options
+@cindex Options, on/off
+Some options need to be given a value if they are called and some do not.
+You can think of the latter type of options as on/off options.
+These two types of options can be distinguished using the output of the
@option{--help} and @option{--usage} options, which are common to all GNU
software, see @ref{Getting help}.
+In Gnuastro we use the following strings to specify when the option needs a
value and what format that value should be in.
+More specific tests will be done in the program and if the values are out of
range (for example, negative when the program only wants a positive value), an
error will be reported.
-@option{--onlyversion} was available from Gnuastro 0.0, so putting it
immediately at the start of a configuration file will ensure that later, you
(or others using different version) will not get a non-recognized option error
in case an option was added/removed.
-@option{--lastconfig} will inform the installed NoiseChisel to not parse any
other configuration files.
-This is done because we do not want the user's user-wide or system wide option
values affecting our results.
-Finally, with the third command, which has a @option{-P} (short for
@option{--printparams}), NoiseChisel will print all the option values visible
to it (in all the configuration files) and the shell will append them to
@file{reproduce.conf}.
-Hence, you do not have to worry about remembering the (possibly) different
options in the different configuration files.
+@vtable @option
-Afterwards, if you run NoiseChisel as shown below (telling it to read this
configuration file with the @file{--config} option).
-You can be sure that there will either be an error (for version mismatch) or
it will produce exactly the same result that you got before.
+@item INT
+The value is read as an integer.
-@example
-$ astnoisechisel --config=reproducible.conf
-@end example
+@item FLT
+The value is read as a float.
+There are generally two types, depending on the context.
+If they are for fractions, they will have to be less than or equal to unity.
-@item --log
-Some programs can generate extra information about their outputs in a log file.
-When this option is called in those programs, the log file will also be
printed.
-If the program does not generate a log file, this option is ignored.
+@item STR
+The value is read as a string of characters.
+For example, column names in a table, or HDU names in a multi-extension FITS
file.
+Other examples include human-readable settings by some programs like the
@option{--domain} option of the Convolve program that can be either
@code{spatial} or @code{frequency} (to specify the type of convolution, see
@ref{Convolve}).
-@cartouche
-@noindent
-@strong{@option{--log} is not thread-safe}: The log file usually has a fixed
name.
-Therefore if two simultaneous calls (with @option{--log}) of a program are made in the same directory, the program will try to write to the same file.
-This will cause problems like an unreasonable log file, undefined behavior, or a crash.
-@end cartouche
+@item FITS @r{or} FITS/TXT
+The value should be a file (most commonly FITS).
+In many cases, other formats may also be accepted (for example, input tables
can be FITS or plain-text, see @ref{Recognized table formats}).
-@cindex CPU threads, set number
-@cindex Number of CPU threads to use
-@item -N INT
-@itemx --numthreads=INT
-Use @option{INT} CPU threads when running a Gnuastro program (see
@ref{Multi-threaded operations}).
-If the value is zero (@code{0}), or this option is not given on the
command-line or any configuration file, the value will be determined at
run-time: the maximum number of threads available to the system when you run a
Gnuastro program.
+@end vtable
-Note that multi-threaded programming is only relevant to some programs.
-In others, this option will be ignored.
+@noindent
+@cindex Values to options
+@cindex Option values
+To specify a value in the short format, simply put the value after the option.
+Note that since the short options are only one character long, you do not have to type anything between the option and its value.
+For the long format you either need white space or an @option{=} sign; for example, @option{-h2}, @option{-h 2}, @option{--hdu 2} or @option{--hdu=2} are all equivalent.
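+Written as complete (hypothetical) commands for the Fits program, these become:
+
+@example
+$ astfits -h2 image.fits
+$ astfits -h 2 image.fits
+$ astfits --hdu 2 image.fits
+$ astfits --hdu=2 image.fits
+@end example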
-@end vtable
+The short format of on/off options (those that do not need values) can be concatenated; for example, these two hypothetical sequences of options are equivalent: @option{-a -b -c4} and @option{-abc4}.
+As an example, consider the following command to run Crop:
+@example
+$ astcrop -Dr3 --wwidth 3 catalog.txt --deccol=4 ASTRdata
+@end example
+@noindent
+The @command{$} is the shell prompt, and @command{astcrop} is the program name.
+There are two arguments (@command{catalog.txt} and @command{ASTRdata}) and four options, two of them given in short format (@option{-D}, @option{-r}) and two in long format (@option{--wwidth} and @option{--deccol}).
+Three of them require a value and one (@option{-D}) is an on/off option.
+@vindex --printparams
+@cindex Options, abbreviation
+@cindex Long option abbreviation
+If an abbreviation is unique between all the options of a program, the long option names can be abbreviated.
+For example, instead of typing @option{--printparams}, typing @option{--print} or maybe even @option{--pri} will be enough; if there are conflicts, the program will warn you and show you the alternatives.
+Finally, if you want the argument parser to stop parsing options beyond a certain point, you can use two dashes: @option{--}.
+Any token on the command-line beyond these two dashes will be read as an argument, even if it starts with a hyphen.
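+For example, in the (hypothetical) command below, the last token is read as a file name argument, not an option:
+
+@example
+$ asttable -- --strange-name.fits
+@end example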
+@cindex Repeated options
+@cindex Options, repeated
+Gnuastro has two types of options with values; those that only take a single value are the most common type.
+If these options are repeated or called more than once on the command-line, the value of the last call will be assigned to the option.
+This is very useful when you are testing/experimenting.
+Let's say you want to make a small modification to one option value.
+You can simply type the option with a new value at the end of the command and see how the script works.
+If you are satisfied with the change, you can remove the original option for human readability.
+If the change was not satisfactory, you can remove the one you just added and not worry about forgetting the original value.
+Without this capability, you would have to memorize or save the original value somewhere else, run the command and then change the value again, which is not at all convenient and can potentially cause lots of bugs.
+On the other hand, some options can be called multiple times in one run of a program and can thus take multiple values (for example, see the @option{--column} option in @ref{Invoking asttable}).
+In these cases, the order of stored values is the same order that you specified on the command-line.
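+The two (hypothetical) Table calls below illustrate both behaviors:
+
+@example
+## Only the last '--output' is used; the table goes to 'b.fits'.
+$ asttable cat.fits --output=a.fits --output=b.fits
+
+## '--column' can be repeated: both columns are kept, in this order.
+$ asttable cat.fits -cRA -cDEC --output=radec.fits
+@end example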
-@node Shell TAB completion, Standard input, Common options, Command-line
-@subsection Shell TAB completion (highly customized)
+@cindex Configuration files
+@cindex Default option values
+Gnuastro's programs do not keep any internal default values, so some options
are mandatory and if they do not have a value, the program will complain and
abort.
+Most programs have many such options and typing them by hand on every call is
impractical.
+To facilitate the user experience, after parsing the command-line, Gnuastro's programs read special configuration files to get the necessary values for the options you have not specified on the command-line.
+These configuration files are fully described in @ref{Configuration files}.
@cartouche
@noindent
-@strong{Under development:} Gnuastro's TAB completion in Bash already greatly
improves usage of Gnuastro on the command-line, but still under development and
not yet complete.
-If you are interested to try it out, please go ahead and activate it (as
described below), we encourage this.
-But please have in mind that there are known
issues@footnote{@url{http://savannah.gnu.org/bugs/index.php?group=gnuastro&category_id=128}}
and you may find new issues.
-If you do, please get in touch with us as described in @ref{Report a bug}.
-TAB completion is currently only implemented in the following programs:
Arithmetic, BuildProgram, ConvertType, Convolve, CosmicCalculator, Crop, Fits
and Table.
-For progress on this task, please see Task
15799@footnote{@url{https://savannah.gnu.org/task/?15799}}.
+@cindex Tilde expansion as option values
+@strong{CAUTION:} In specifying a file address, if you want to use the shell's
tilde expansion (@command{~}) to specify your home directory, leave at least
one space between the option name and your value.
+For example, use @command{-o ~/test}, @command{--output ~/test} or
@command{--output= ~/test}.
+Calling them with @command{-o~/test} or @command{--output=~/test} will disable
shell expansion.
@end cartouche
-
-@cindex Bash auto-complete
-@cindex Completion in the shell
-@cindex Bash programmable completion
-@cindex Autocomplete (in the shell/Bash)
-Bash provides a built-in feature called @emph{programmable
completion}@footnote{@url{https://www.gnu.org/software/bash/manual/html_node/Programmable-Completion.html}}
to help increase interactive workflow efficiency and minimize the number of
key-strokes @emph{and} the need to memorize things.
-It is also known as TAB completion, bash completion, auto-completion, or word
completion.
-Completion is activated by pressing @key{[TAB]} while you are typing a command.
-For file arguments this is the default behavior already and you have probably
used it a lot with any command-line program.
-
-Besides this simple/default mode, Bash also enables a high level of
customization features for its completion.
-These features have been extensively used in Gnuastro to improve your work
efficiency@footnote{To learn how Gnuastro implements TAB completion in Bash,
see @ref{Bash programmable completion}.}.
-For example, if you are running @code{asttable} (which only accepts files
containing a table), and you press @key{[TAB]}, it will only suggest files
containing tables.
-As another example, if an option needs image HDUs within a FITS file, pressing
@key{[TAB]} will only suggest the image HDUs (and not other possibly existing
HDUs that contain tables, or just metadata).
-Just note that the file name has to be already given on the command-line
before reaching such options (that look into the contents of a file).
-
-But TAB completion is not limited to file types or contents.
-Arguments/Options that take certain fixed string values will directly suggest
those strings with TAB, and completely ignore the file structure (for example,
spectral line names in @ref{Invoking astcosmiccal})!
-As another example, the option @option{--numthreads} option (to specify the
number of threads to use by the program), will find the number of available
threads on the system, and suggest the possible numbers with a TAB!
-
-To activate Gnuastro's custom TAB completion in Bash, you need to put the
following line in one of your Bash startup files (for example,
@file{~/.bashrc}).
-If you installed Gnuastro using the steps of @ref{Quick start}, you should
have already done this (the command just after @command{sudo make install}).
-For a list of (and discussion on) Bash startup files and installation
directories see @ref{Installation directory}.
-Of course, if Gnuastro was installed in a custom location, replace the
`@file{/usr/local}' part of the line below to the value that was given to
@option{--prefix} during Gnuastro's configuration@footnote{In case you do not
know the installation directory of Gnuastro on your system, you can find out
with this command: @code{which astfits | sed -e"s|/bin/astfits||"}}.
-
-@example
-# Enable Gnuastro's TAB completion
-source /usr/local/share/gnuastro/completion.bash
-@end example
-
-After adding the line above in a Bash startup file, TAB completion will always
be activated in any new terminal.
-To see if it has been activated, try it out with @command{asttable [TAB][TAB]}
and @command{astarithmetic [TAB][TAB]} in a directory that contains tables and
images.
-The first will only suggest the files with a table, and the second, only those
with an image.
-
@cartouche
@noindent
-@strong{TAB completion only works with long option names:}
-As described above, short options are much more complex to generalize,
therefore TAB completion is only available for long options.
-But do not worry!
-TAB completion also involves option names, so if you just type
@option{--a[TAB][TAB]}, you will get the list of options that start with an
@option{--a}.
-Therefore as a side-effect of TAB completion, your commands will be far more
human-readable with minimal key strokes.
+@strong{CAUTION:} If you forget to specify a value for an option which requires one, and that option is the last one, Gnuastro will warn you.
+But if it is in the middle of the command, it will take the text of the next option or argument as its value, which can cause undefined behavior.
+@end cartouche
+@cartouche
+@noindent
+@cindex Counting from zero.
+@strong{NOTE:} In some contexts Gnuastro's counting starts from 0 and in others from 1.
+By default you can assume that counting starts from 1; if it starts from 0 for a special option, this will be explicitly mentioned.
@end cartouche
+@node Common options, Shell TAB completion, Arguments and options, Command-line
+@subsection Common options
-@node Standard input, Shell tips, Shell TAB completion, Command-line
-@subsection Standard input
-
-@cindex Standard input
-@cindex Stream: standard input
-The most common way to feed the primary/first input dataset into a program is
to give its filename as an argument (discussed in @ref{Arguments}).
-When you want to run a series of programs in sequence, this means that each
will have to keep the output of each program in a separate file and re-type
that file's name in the next command.
-This can be very slow and frustrating (mis-typing a file's name).
-
-@cindex Standard output stream
-@cindex Stream: standard output
-To solve the problem, the founders of Unix defined pipes to directly feed the
output of one program (its ``Standard output'' stream) into the ``standard
input'' of a next program.
-This removes the need to make temporary files between separate processes and
became one of the best demonstrations of the Unix-way, or Unix philosophy.
+@cindex Options common to all programs
+@cindex Gnuastro common options
+To facilitate the job of the users and developers, all the programs in Gnuastro share some basic command-line options for the operations that are common to many of the programs.
+The full list is classified as @ref{Input output options}, @ref{Processing
options}, and @ref{Operating mode options}.
+In some programs, some of the options are irrelevant, but still recognized
(you will not get an unrecognized option error, but the value is not used).
+Unless otherwise mentioned, these options are identical between all programs.
-Every program has three streams identifying where it reads/writes non-file
inputs/outputs: @emph{Standard input}, @emph{Standard output}, and
@emph{Standard error}.
-When a program is called alone, all three are directed to the terminal that
you are using.
-If it needs an input, it will prompt you for one and you can type it in.
-Or, it prints its results in the terminal for you to see.
+@menu
+* Input output options:: Common input/output options.
+* Processing options:: Options for common processing steps.
+* Operating mode options:: Common operating mode options.
+@end menu
-For example, say you have a FITS table/catalog containing the B and V band
magnitudes (@code{MAG_B} and @code{MAG_V} columns) of a selection of galaxies
along with many other columns.
-If you want to see only these two columns in your terminal, can use Gnuastro's
@ref{Table} program like below:
+@node Input output options, Processing options, Common options, Common options
+@subsubsection Input/Output options
-@example
-$ asttable cat.fits -cMAG_B,MAG_V
-@end example
+These options relate to the inputs and outputs of the various programs.
-Through the Unix pipe mechanism, when the shell confronts the pipe character
(@key{|}), it connects the standard output of the program before the pipe, to
the standard input of the program after it.
-So it is literally a ``pipe'': everything that you would see printed by the
first program on the command (without any pipe), is now passed to the second
program (and not seen by you).
+@vtable @option
-@cindex AWK
-@cindex GNU AWK
-To continue the previous example, let's say you want to see the B-V color.
-To do this, you can pipe Table's output to AWK (a wonderful tool for
processing things like plain text tables):
+@cindex Timeout
+@cindex Standard input
+@item --stdintimeout
+Number of micro-seconds to wait for writing/typing in the @emph{first line} of
standard input from the command-line (see @ref{Standard input}).
+This is only relevant for programs that also accept input from the standard
input, @emph{and} you want to manually write/type the contents on the terminal.
+When the standard input is already connected to a pipe (output of another
program), there will not be any waiting (hence no timeout, thus making this
option redundant).
-@example
-$ asttable cat.fits -cMAG_B,MAG_V | awk '@{print $1-$2@}'
-@end example
+If the first line-break (for example, with the @key{ENTER} key) is not
provided before the timeout, the program will abort with an error that no input
was given.
+Note that this time interval is @emph{only} for the first line that you type.
+Once the first line is given, the program will assume that more data will come and accept the rest of your input without any time limit.
+You need to specify the ending of the standard input, for example, by pressing @key{CTRL-D} after a new line.
-But understanding the distribution by visually seeing all the numbers under
each other is not too useful! You can therefore feed this single column
information into @ref{Statistics} to give you a general feeling of the
distribution with the same command:
+Note that any input you write/type into a program on the command-line with
Standard input will be discarded (lost) once the program is finished.
+It is only recoverable manually from your command-line (where you actually
typed) as long as the terminal is open.
+So only use this feature when you are sure that you do not need the dataset
(or have a copy of it somewhere else).
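+For example, the first (hypothetical) call below gives you roughly ten seconds to type the first line; in the second, the input comes from a pipe, so the timeout is irrelevant:
+
+@example
+$ asttable --stdintimeout=10000000
+$ cat table.txt | asttable --stdintimeout=10000000
+@end example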
-@example
-$ asttable cat.fits -cMAG_B,MAG_V | awk '@{print $1-$2@}' | aststatistics
-@end example
-Gnuastro's programs that accept input from standard input, only look into the
Standard input stream if there is no first argument.
-In other words, arguments take precedence over Standard input.
-When no argument is provided, the programs check if the standard input stream
is already full or not (output from another program is waiting to be used).
-If data is present in the standard input stream, it is used.
+@cindex HDU
+@cindex Header data unit
+@item -h STR/INT
+@itemx --hdu=STR/INT
+The name or number of the desired Header Data Unit, or HDU, in the FITS image.
+A FITS file can store multiple HDUs or extensions, each with either an image
or a table or nothing at all (only a header).
+Note that counting of the extensions starts from 0 (zero), not 1 (one).
+Counting from 0 is forced on us by CFITSIO which directly reads the value you
give with this option (see @ref{CFITSIO}).
+When specifying the name, case is not important so @command{IMAGE},
@command{image} or @command{ImAgE} are equivalent.
-When the standard input is empty, the program will wait
@option{--stdintimeout} micro-seconds for you to manually enter the first line
(ending with a new-line character, or the @key{ENTER} key, see @ref{Input
output options}).
-If it detects the first line in this time, there is no more time limit, and
you can manually write/type all the lines for as long as it takes.
-To inform the program that Standard input has finished, press @key{CTRL-D}
after a new line.
-If the program does not catch the first line before the time-out finishes, it
will abort with an error saying that no input was provided.
+CFITSIO has many capabilities to help you find the extension you want, far
beyond the simple extension number and name.
+See CFITSIO manual's ``HDU Location Specification'' section for a very
complete explanation with several examples.
+A @code{#} is appended to the string you specify for the HDU@footnote{With the @code{#} character, CFITSIO will only read the desired HDU into your memory, not all the existing HDUs in the FITS file.} and the result is put in square brackets and appended to the FITS file name before calling CFITSIO to read the contents of the HDU for all the programs in Gnuastro.
@cartouche
@noindent
-@strong{Manual input in Standard input is discarded:}
-Be careful that when you manually fill the Standard input, the data will be
discarded once the program finishes and reproducing the result will be
impossible.
-Therefore this form of providing input is only good for temporary tests.
+@strong{Default HDU is HDU number 1 (counting from 0):} by default, Gnuastro's programs assume that their (main/first) input is in HDU number 1 (counting from zero).
+So if you don't specify the HDU number, the program will read the input from this HDU.
+For programs that can take multiple FITS datasets as input (like
@ref{Arithmetic}) this default HDU applies to the first input, you still need
to call @option{--hdu} for the other inputs.
+Generally, all Gnuastro's programs write their outputs in HDU number 1 (HDU 0
is reserved for metadata like the configuration parameters that the program was
run with).
+For more on this, see @ref{Fits}.
@end cartouche
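+For example, the two (hypothetical) calls below are equivalent when the second extension of @file{image.fits} has the name @code{IMAGE} (recall that names are not case sensitive):
+
+@example
+$ astfits image.fits --hdu=1
+$ astfits image.fits --hdu=image
+@end example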
-@cartouche
-@noindent
-@strong{Standard input currently only for plain text:}
-Currently Standard input only works for plain text inputs like the example
above.
-We will later allow FITS files into the programs through standard input also.
-@end cartouche
+@item -s STR
+@itemx --searchin=STR
+Where to match/search for columns when the column identifier was not a number,
see @ref{Selecting table columns}.
+The acceptable values are @command{name}, @command{unit}, or @command{comment}.
+This option is only relevant for programs that take table columns as input.
+@item -I
+@itemx --ignorecase
+Ignore case while matching/searching column meta-data (in the field specified by @option{--searchin}).
+The FITS standard suggests treating the column names as case-insensitive, which is strongly recommended here also, but is not enforced.
+This option is only relevant for programs that take table columns as input.
-@node Shell tips, , Standard input, Command-line
-@subsection Shell tips
+This option is not relevant to @ref{BuildProgram}, hence in that program the
short option @option{-I} is used for include directories, not to ignore case.
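+For example, the (hypothetical) call below selects the column whose name is @code{OBJ_ID}, even if it is stored as @code{obj_id}:
+
+@example
+$ asttable cat.fits --searchin=name --ignorecase -cOBJ_ID
+@end example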
-Gnuastro's programs are primarily meant to be run on the command-line shell
environment.
-In this section, we will review some useful tips and tricks that can be
helpful in the pipelines that you run.
+@item -o STR
+@itemx --output=STR
+The name of the output file or directory.
+With this option, the automatic output names explained in @ref{Automatic output} are ignored.
-@menu
-* Separate shell variables for multiple outputs:: When you get values from
one command.
-@end menu
+@item -T STR
+@itemx --type=STR
+The data type of the output, depending on the program context.
+This option is not applicable to some programs like @ref{Fits} and will be
ignored by them.
+The different acceptable values to this option are fully described in
@ref{Numeric data types}.
-@node Separate shell variables for multiple outputs, , Shell tips, Shell tips
-@subsubsection Separate shell variables for multiple outputs
+@item -D
+@itemx --dontdelete
+By default, if the output file already exists, Gnuastro's programs will
silently delete it and put their own outputs in its place.
+When this option is activated, if the output file already exists, the programs
will not delete it, will warn you, and will abort.
-Sometimes your commands print multiple values and you want to use them as
different shell variables.
-Let's describe the problem (shown in the box below) with an example (that you
can reproduce without any external data).
+@item -K
+@itemx --keepinputdir
+In automatic output names, do not remove the directory information of the
input file names.
+As explained in @ref{Automatic output}, if no output name is specified (with
@option{--output}), then the output name will be made in the existing directory
based on your input's file name (ignoring the directory of the input).
+If you call this option, the directory information of the input will be kept
and the automatically generated output name will be in the same directory as
the input (usually with a suffix added).
+Note that this is only relevant if you are running the program in a different directory than the input data.
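+For example (the automatically generated suffix below is only illustrative), when the input is in another directory:
+
+@example
+$ astnoisechisel /data/image.fits      ## writes ./image_detected.fits
+$ astnoisechisel /data/image.fits -K   ## writes /data/image_detected.fits
+@end example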
-With the commands below, we'll first make a noisy (@mymath{\sigma=5}) image
(@mymath{100\times100} pixels) using @ref{Arithmetic}.
-Then, we'll measure@footnote{The actual printed values by
@command{aststatistics} may slightly differ for you.
-This is because of a different random number generator seed used in
@command{astarithmetic}.
-To get an exactly reproducible result, see @ref{Generating random numbers}}
its mean and standard deviation using @ref{Statistics}.
+@item -t STR
+@itemx --tableformat=STR
+The output table's type.
+This option is only relevant when the output is a table and its format cannot
be deduced from its filename.
+For example, if a name ending in @file{.fits} was given to @option{--output},
then the program knows you want a FITS table.
+But there are two types of FITS tables: FITS ASCII, and FITS binary.
+Thus, with this option, the program is able to identify which type you want.
+The currently recognized values to this option are:
-@example
-$ astarithmetic 100 100 2 makenew 5 mknoise-sigma -oimg.fits
+@table @command
+@item txt
+A plain text table with white-space characters between the columns (see @ref{Gnuastro text table format}).
+@item fits-ascii
+A FITS ASCII table (see @ref{Recognized table formats}).
+@item fits-binary
+A FITS binary table (see @ref{Recognized table formats}).
+@end table
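+For example, the (hypothetical) command below converts a plain-text table into a FITS binary table; since the @file{.fits} suffix alone does not distinguish the two FITS table types, the type is selected explicitly:
+
+@example
+$ asttable cat.txt --output=cat.fits --tableformat=fits-binary
+@end example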
-$ aststatistics img.fits --mean --std
--3.10938611484039e-03 4.99607077069093e+00
-@end example
+@item --wcslinearmatrix=STR
+Select the linear transformation matrix of the output's WCS.
+This option only takes two values: @code{pc} (for the @code{PCi_j} formalism) and @code{cd} (for @code{CDi_j}).
+For more on the different formalisms, please see Section 8.1 of the FITS standard@footnote{@url{https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf}}, version 4.0.
-@cartouche
-@noindent
-@strong{THE PROBLEM:} you want the first number printed above to be stored in
a shell variable called @code{my_mean} and the second number to be stored as
the @code{my_std} shell variable (you are free to choose any name!).
-@end cartouche
+@cindex @code{CDELT}
+In short, in the @code{PCi_j} formalism, we only keep the linear rotation matrix in these keywords and put the scaling factor (or the pixel scale in astronomical imaging) in the @code{CDELTi} keywords.
+In the @code{CDi_j} formalism, we blend the scaling and rotation into a single matrix and keep that matrix in these FITS keywords.
+By default, Gnuastro uses the @code{PCi_j} formalism, because it greatly helps in human readability of the raw keywords and is also the default mode of WCSLIB.
+However, in some circumstances it may be necessary to have the keywords in the CD format; for example, when you need to feed the outputs into other software that does not follow the full FITS standard and only recognizes the @code{CDi_j} formalism.
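+For example, the (hypothetical) Warp command below writes its output's WCS rotation in the @code{CDi_j} formalism:
+
+@example
+$ astwarp image.fits --rotate=20 --wcslinearmatrix=cd
+@end example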
-@noindent
-The first thing that may come to mind is to run Statistics two times, and
write the output into separate variables like below:
+@end vtable
-@example
-$ my_std=$(aststatistics img.fits --std) ## NOT SOLUTION! ##
-$ my_mean=$(aststatistics img.fits --mean) ## NOT SOLUTION! ##
-@end example
-@cindex Global warming
-@cindex Carbon footprint
-But this is not a good solution because as @file{img.fits} becomes larger
(more pixels), the time it takes for Statistics to simply load the data into
memory can be significant.
-This will slow down your pipeline and besides wasting your time, it
contributes to global warming (by spending energy on an un-necessary action;
take this seriously because your pipeline may scale up to involve thousands of
large datasets)!
-Furthermore, besides loading of the input data, Statistics (and Gnuastro in
general) is designed to do multiple measurements in one pass over the data as
much as possible (to further decrease Gnuastro's carbon footprint).
-So when given @option{--mean --std}, it will measure both in one pass over the
pixels (not two passes!).
-In other words, in this case, you get the two measurements for the cost of one.
+@node Processing options, Operating mode options, Input output options, Common
options
+@subsubsection Processing options
-How do you separate the values from the first @command{aststatistics} command
above?
-One ugly way is to write the two-number output string into a single shell
variable and then separate, or tokenize, the string with two subsequent
commands like below:
+Some processing steps are common to several programs, so they are defined as
common options to all programs.
+Note that this class of common options is thus necessarily less common between all the programs than those described in @ref{Input output options} or @ref{Operating mode options}.
+Also, if they are irrelevant for a program, these options will not be displayed in the @option{--help} output of the program.
-@c Note that the comments aren't aligned in the Texinfo source because of
-@c the '@' characters before the braces of AWK. In the output, they are
-@c aligned.
-@example
-$ meanstd=$(aststatistics img.fits --mean --std) ## NOT SOLUTION! ##
-$ my_mean=$(echo $meanstd | awk '@{print $1@}') ## NOT SOLUTION! ##
-$ my_std=$(echo $meanstd | awk '@{print $2@}') ## NOT SOLUTION! ##
-@end example
+@table @option
+
+@item --minmapsize=INT
+The minimum size (in bytes) to memory-map a processing/internal array as a
file (on the non-volatile HDD/SSD), and not use the system's RAM.
+Before using this option, please read @ref{Memory management}.
+By default processing arrays will only be memory-mapped to a file when the RAM
is full.
+With this option, you can force the memory-mapping, even when there is enough
RAM.
+To ensure this default behavior, the pre-defined value to this option is an
extremely large value (larger than any existing RAM).
+
+Please note that using a non-volatile file (on the HDD/SSD) instead of RAM can significantly increase the program's running time, especially on HDDs (where read/write is slower).
+Also, note that the number of memory-mapped files that your kernel can support
is limited.
+So when this option is necessary, it is best to give it values larger than 1
megabyte (@option{--minmapsize=1000000}).
+You can then decrease it for a specific program's invocation on a large input
after you see memory issues arise (for example, an error, or the program not
aborting and fully consuming your memory).
+If you see randomly named files remaining in this directory when the program
finishes normally, please send us a bug report so we address the problem, see
@ref{Report a bug}.
@cartouche
@noindent
-@cindex Evaluate string as command (@command{eval})
-@cindex @command{eval} to evaluate string as command
-@strong{SOLUTION:} The solution is to formatted-print (@command{printf}) the
numbers as shell variables definitions in a string, and evaluate
(@command{eval}) that string as a command:
-
-@example
-$ eval "$(aststatistics img.fits --mean --std \
- | xargs printf "my_mean=%s; my_std=%s")"
-@end example
+@strong{Limited number of memory-mapped files:} The operating system kernels
usually support a limited number of memory-mapped files.
+Therefore never set @code{--minmapsize} to zero or a small number of bytes (which would create too many memory-mapped files).
+If the kernel capacity is exceeded, the program will crash.
@end cartouche
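+For example, the (hypothetical) call below forces any internal array larger than about one gigabyte to be memory-mapped to a file instead of being kept in RAM:
+
+@example
+$ astnoisechisel image.fits --minmapsize=1000000000
+@end example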
-@noindent
-Let's review the solution (in more detail):
+@item --quietmmap
+Do not print any message when an array is stored in non-volatile memory (HDD/SSD) and not RAM; see the description of @option{--minmapsize} (above) for more.
-@enumerate
-@item
-@cindex Standard input
-@cindex @command{xargs} (extended arguments)
-We pipe the output into @command{xargs}@footnote{For more on @command{xargs},
see @url{https://en.wikipedia.org/wiki/Xargs}.
-It will take the standard input (from the pipe in this scenario) and put it as
arguments of the next program (@command{printf} in this scenario).
-In other words, it is good for programs that don't take input from standard
input (@command{printf} in this case; but also includes others like
@command{cp}, @command{rm}, or @command{echo}).} (extended arguments) which
puts the two numbers it gets from the pipe, as arguments for @command{printf}
(formatted print; because @command{printf} doesn't take input from pipes).
-@item
-Within the @command{printf} call, we write the values after putting a variable
name and equal-sign, and in between them we put a @key{;} (as if it was a shell
command).
-The @code{%s} tells @command{printf} to print each input as a string (not to
interpret it as a number and loose precision).
-Here is the output of this phase:
+@item -Z INT[,INT[,...]]
+@itemx --tilesize=INT[,INT[,...]]
+The size of regular tiles for tessellation, see @ref{Tessellation}.
+For each dimension an integer length (in units of data-elements or pixels) is
necessary.
+If the number of input dimensions is different from the number of values given
to this option, the program will stop with an error.
+Values must be separated by commas (@key{,}) and can also be fractions (for
example, @code{4/2}).
+If they are fractions, the result must be an integer, otherwise an error will
be printed.
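+For example, the (hypothetical) call below tessellates a 2D image into tiles of @mymath{50\times50} pixels (a fraction like @code{100/2} would also be acceptable):
+
+@example
+$ astnoisechisel image.fits --tilesize=50,50
+@end example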
-@example
-$ aststatistics img.fits --mean --std \
- | xargs printf "my_mean=%s; my_std=%s"
-my_mean=-3.10938611484039e-03; my_std=4.99607077069093e+00
-@end example
+@item -M INT[,INT[,...]]
+@itemx --numchannels=INT[,INT[,...]]
+The number of channels for larger input tessellation, see @ref{Tessellation}.
+The number and types of acceptable values are similar to @option{--tilesize}.
+The only difference is that the integer values given to this option represent
the @emph{number} of channels, not their size.
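+
+For example, the (hypothetical) call below divides the input into two channels
along the first dimension and one along the second, before tiling each channel
separately:
+
+@example
+$ astnoisechisel image.fits --numchannels=2,1
+@end example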
-@item
-But the output above is a string! To evaluate this string as a command, we
give it to the eval command like above.
-@end enumerate
+@item -F FLT
+@itemx --remainderfrac=FLT
+The fraction of remainder size along all dimensions to add to the first tile.
+See @ref{Tessellation} for a complete description.
+This option is only relevant if the input dataset's size in a dimension is
not exactly divisible by @option{--tilesize} in that dimension.
+If the remainder size is larger than this fraction (compared to
@option{--tilesize}), then the remainder size will be added with one regular
tile size and divided between two tiles at the start and end of the given
dimension.
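+
+As an illustration, take a (hypothetical) axis of 100 pixels with
@option{--tilesize=30}: three regular tiles fit, leaving a remainder of 10
pixels.
+With @option{--remainderfrac=0.1}, the acceptable remainder is only
@mymath{0.1\times30=3} pixels, so the 10-pixel remainder plus one regular tile
(40 pixels in total) is split into two 20-pixel tiles at the start and end of
that axis, giving tile sizes of 20, 30, 30 and 20.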
-@noindent
-After the solution above, you will have the two @code{my_mean} and
@code{my_std} variables to use separately in your pipeline:
+@item --workoverch
+Ignore the channel borders for the high-level job of the given application.
+As a result, while the channel borders are respected in defining the small
tiles (such that no tile will cross a channel border), the higher-level program
operation will ignore them, see @ref{Tessellation}.
-@example
-$ echo $my_mean
--3.10938611484039e-03
-$ echo $my_std
-4.99607077069093e+00
-@end example
+@item --checktiles
+Make a FITS file with the same dimensions as the input but each pixel is
replaced with the ID of the tile that it is associated with.
+Note that the tile IDs start from 0.
+See @ref{Tessellation} for more on tiling an image in Gnuastro.
-@cindex Zsh shell
-@cindex Dash shell
-@cindex Portable script
-This @command{eval}-based solution has been tested in GNU Bash, Dash and
Zsh and it works nicely in them (is ``portable'').
-This is because the constructs used here are pretty low-level (and widely
available).
+@item --oneelempertile
+When showing the tile values (for example, with @option{--checktiles}, or when
the program's output is tessellated) only use one element for each tile.
+This can be useful when only the relative values given to each tile compared
to the rest are important or need to be checked.
+Since the tiles usually have a large number of pixels within them, the output
will be much smaller, and so easier to read, write, store, or send.
-For example usages of this technique, see the following sections:
@ref{Extracting a single spectrum and plotting it} and @ref{Pseudo narrow-band
images}.
+Note that when the full input size in any dimension is not exactly divisible
by the given @option{--tilesize} in that dimension, the edge tile(s) will have
different sizes (in units of the input's size), see @option{--remainderfrac}.
+But with this option, all displayed values are going to have the (same) size
of one data-element.
+Hence, in such cases, the image proportions are going to be slightly different
with this option.
-@node Configuration files, Getting help, Command-line, Common program behavior
-@section Configuration files
+If your input image is not exactly divisible by the tile size and you want one
value per tile for some higher-level processing, all is not lost though.
+You can see how many pixels were within each tile (for example, to weight the
values or discard some for later processing) with Gnuastro's Statistics (see
@ref{Statistics}) as shown below.
+The output FITS file is going to have two extensions, one with the median
calculated on each tile and one with the number of elements that each tile
covers.
+You can then use the @code{where} operator in @ref{Arithmetic} to set the
values of all tiles that do not have the regular area to a blank value.
-@cindex @file{etc}
-@cindex Configuration files
-@cindex Necessary parameters
-@cindex Default option values
-@cindex File system Hierarchy Standard
-Each program needs a certain number of parameters to run.
-Supplying all the necessary parameters each time you run the program is very
frustrating and prone to errors.
-Therefore all the programs read the values for the necessary options you have
not given in the command-line from one of several plain text files (which you
can view and edit with any text editor).
-These files are known as configuration files and are usually kept in a
directory named @file{etc/} according to the file system hierarchy
-standard@footnote{@url{http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard}}.
+@example
+$ aststatistics --median --number --ontile input.fits \
+ --oneelempertile --output=o.fits
+$ REGULAR_AREA=1600 # Check second extension of `o.fits'.
+$ astarithmetic o.fits o.fits $REGULAR_AREA ne nan where \
+ -h1 -h2
+@end example
-@vindex --output
-@vindex --numthreads
-@cindex CPU threads, number
-@cindex Internal default value
-@cindex Number of CPU threads to use
-The thing to have in mind is that none of the programs in Gnuastro keep any
internal default value.
-All the values must either be stored in one of the configuration files or
explicitly called in the command-line.
-In case the necessary parameters are not given through any of these methods,
the program will print a missing option error and abort.
-The only exception to this is @option{--numthreads}, whose default value is
determined at run-time using the number of threads available to your system,
see @ref{Multi-threaded operations}.
-Of course, you can still provide a default value for the number of threads at
any of the levels below, but if you do not, the program will not abort.
-Also note that through automatic output name generation, the value to the
@option{--output} option is also not mandatory on the command-line or in the
configuration files for all programs which do not rely on that value as an
input@footnote{One example of a program which uses the value given to
@option{--output} as an input is ConvertType, this value specifies the type of
the output through the value to @option{--output}, see @ref{Invoking
astconvertt}.}, see @ref{Automatic output}.
+Note that if @file{input.fits} also has blank values, then the median on
+tiles with blank values will also be ignored with the command above (which
+is desirable).
+@item --interponlyblank
+When values are to be interpolated, only change the values of the blank
+elements, keep the non-blank elements untouched.
-@menu
-* Configuration file format:: ASCII format of configuration file.
-* Configuration file precedence:: Precedence of configuration files.
-* Current directory and User wide:: Local and user configuration files.
-* System wide:: System wide configuration files.
-@end menu
+@item --interpmetric=STR
+@cindex Radial metric
+@cindex Taxicab metric
+@cindex Manhattan metric
+@cindex Metric: Manhattan, Taxicab, Radial
+The metric to use for finding nearest neighbors.
+Currently it only accepts the Manhattan (or taxicab) metric with
@code{manhattan}, or the radial metric with @code{radial}.
-@node Configuration file format, Configuration file precedence, Configuration
files, Configuration files
-@subsection Configuration file format
+The Manhattan distance between two points is defined with
@mymath{|\Delta{x}|+|\Delta{y}|}.
+Thus the Manhattan metric has the advantage of being fast, but at the expense
of being less accurate.
+The radial distance is the standard definition of distance in a Euclidean
space: @mymath{\sqrt{\Delta{x}^2+\Delta{y}^2}}.
+It is accurate, but the multiplication and square root can slow down the
processing.
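+
+For example, the (hypothetical) call below selects the more accurate, but
slower, radial metric:
+
+@example
+$ astnoisechisel image.fits --interpmetric=radial
+@end example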
-@cindex Configuration file suffix
-The configuration files for each program have the standard program executable
name with a `@file{.conf}' suffix.
-When you download the source code, you can find them in the same directory as
the source code of each program, see @ref{Program source}.
+@item --interpnumngb=INT
+The number of nearby non-blank neighbors to use for interpolation.
+@end table
-@cindex White space character
-@cindex Configuration file format
-Any line in the configuration file whose first non-white character is a
@key{#} is considered to be a comment and is ignored.
-An empty line is also similarly ignored.
-The long name of the option should be used as an identifier.
-The option name and option value should be separated by any number of
`white-space' characters (space, tab or vertical tab) or an equal (@key{=}).
-By default several space characters are used.
-If the value of an option has space characters (most commonly for the
@option{hdu} option), then the full value can be enclosed in double quotation
signs (@key{"}, similar to the example in @ref{Arguments and options}).
-If it is an option without a value in the @option{--help} output (on/off
option, see @ref{Options}), then the value should be @option{1} if it is to be
`on' and @option{0} otherwise.
+@node Operating mode options, , Processing options, Common options
+@subsubsection Operating mode options
-In each non-commented and non-blank line, any text after the first two words
(option identifier and value) is ignored.
-If an option identifier is not recognized in the configuration file, the name
of the file, the line number of the unrecognized option, and the unrecognized
identifier name will be reported and the program will abort.
-If a parameter is repeated more than once in the configuration files,
accepts only one value, and is not set on the command-line, then only the first
value will be used, the rest will be ignored.
+Another group of options that are common to all the programs in Gnuastro are
those to do with the general operation of the programs.
+The explanations for those that are not limited to Gnuastro, but are common
to all GNU programs, start with ``(GNU option)''.
-@cindex Writing configuration files
-@cindex Automatic configuration file writing
-@cindex Configuration files, writing
-You can build or edit any of the directories and the configuration files
yourself using any text editor.
-However, it is recommended to use the @option{--setdirconf} and
@option{--setusrconf} options to set default values for the current directory
or this user, see @ref{Operating mode options}.
-With these options, the values you give will be checked before writing in the
configuration file.
-They will also print a set of commented lines guiding the reader and will also
classify the options based on their context and write them in their logical
order to be more understandable.
+@vtable @option
+@item --
+(GNU option) Stop parsing the command-line.
+This option can be useful in scripts or when using the shell history.
+Suppose you have a long list of options, and want to see if removing some of
them (to read from configuration files, see @ref{Configuration files}) can give
a better result.
+If the ones you want to remove are the last ones on the command-line, you do
not have to delete them; you can just add @option{--} before them, and if you
do not get what you want, you can remove the @option{--} and get the same
initial result.
-@node Configuration file precedence, Current directory and User wide,
Configuration file format, Configuration files
-@subsection Configuration file precedence
+@item --usage
+(GNU option) Only print the options and arguments and abort.
+This is very useful when you know what the options do, but have just
forgotten their long/short identifiers, see @ref{--usage}.
-@cindex Configuration file precedence
-@cindex Configuration file directories
-@cindex Precedence, configuration files
-The option values in all the programs of Gnuastro will be filled in the
following order.
-If an option only takes one value which is given in an earlier step, any value
for that option in a later step will be ignored.
-Note that if the @option{lastconfig} option is specified in any step below, no
other configuration files will be parsed (see @ref{Operating mode options}).
+@item -?
+@itemx --help
+(GNU option) Print all options with an explanation and abort.
+Adding this option will print all the options in their short and long formats,
also displaying which ones need a value if they are called (with an @option{=}
after the long format followed by a string specifying the format, see
@ref{Options}).
+A short explanation is also given for what the option is for.
+The program will quit immediately after the message is printed and will not do
any form of processing, see @ref{--help}.
-@enumerate
-@item
-Command-line options, for a particular run of ProgramName.
+@item -V
+@itemx --version
+(GNU option) Print a short message, showing the full name, version, copyright
information and program authors and abort.
+On the first line, it will print the official name (not executable name) and
version number of the program.
+Following this is a blank line and the copyright information.
+The program will not run.
-@item
-@file{.gnuastro/astprogname.conf} is parsed by ProgramName in the current
directory.
+@item -q
+@itemx --quiet
+Do not report steps.
+All the programs in Gnuastro that have multiple major steps will report their
steps for you to follow while they are operating.
+If you do not want to see these reports, you can call this option and only
error/warning messages will be printed.
+If the steps are done very fast (depending on the properties of your input)
disabling these reports will also decrease running time.
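+
+For example, the (hypothetical) call below will only print error or warning
messages (if any), not the step-by-step reports:
+
+@example
+$ astnoisechisel image.fits --quiet
+@end example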
-@item
-@file{.gnuastro/gnuastro.conf} is parsed by all Gnuastro programs in the
current directory.
+@item --cite
+Print all necessary information to cite and acknowledge Gnuastro in your
published papers.
+With this option, the programs will print the Bib@TeX{} entry to include in
your paper for Gnuastro in general, and the particular program's paper (if that
program comes with a separate paper).
+It will also print the necessary acknowledgment statement to add in the
respective section of your paper and it will abort.
+For a more complete explanation, please see @ref{Acknowledgments}.
-@item
-@file{$HOME/.local/etc/astprogname.conf} is parsed by ProgramName in the
user's home directory (see @ref{Current directory and User wide}).
+Citations and acknowledgments are vital for the continued work on Gnuastro.
+Gnuastro started, and is continued, based on separate research projects.
+So if you find any of the tools offered in Gnuastro to be useful in your
research, please use the output of this command to cite and acknowledge the
program (and Gnuastro) in your research paper.
+Thank you.
-@item
-@file{$HOME/.local/etc/gnuastro.conf} is parsed by all Gnuastro programs in
the user's home directory (see @ref{Current directory and User wide}).
+Gnuastro is still new; there is no separate paper devoted only to Gnuastro
yet.
+Therefore the paper to cite for Gnuastro is currently the NoiseChisel paper,
which is the first published paper introducing Gnuastro to the astronomical
community.
+Upon reaching a certain point, a paper completely devoted to describing
Gnuastro's many functionalities will be published, see @ref{GNU Astronomy
Utilities 1.0}.
-@item
-@file{prefix/etc/astprogname.conf} is parsed by ProgramName in the system-wide
installation directory (see @ref{System wide} for @file{prefix}).
+@item -P
+@itemx --printparams
+With this option, Gnuastro's programs will read your command-line options and
all the configuration files.
+If there is no problem (like a missing parameter, or a value in the wrong
format or range), then immediately before actually running, the programs will
print the full list of option names, values and descriptions (sorted and
grouped by context) and abort.
+They will also report the version number, the date they were configured on
your system and the time they were reported.
-@item
-@file{prefix/etc/gnuastro.conf} is parsed by all Gnuastro programs in the
system-wide installation directory (see @ref{System wide} for @file{prefix}).
+As an example, you can give your full command-line options and even the input
and output file names, and finally just add @option{-P} to check if all the
parameters are set as you intend.
+If everything is OK, you can just run the same command (easily retrieved from
the shell history, with the top arrow key) and simply remove the last two
characters that showed this option.
-@end enumerate
+No program will actually start its processing when this option is called.
+The otherwise mandatory arguments for each program (for example, input image
or catalog files) are no longer required when you call this option.
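+
+For example, since no input is necessary with this option, the (hypothetical)
call below will just print all of NoiseChisel's option values (with
@option{--tilesize} taken from the command-line) and abort:
+
+@example
+$ astnoisechisel --tilesize=20,20 -P
+@end example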
-The basic idea behind setting this progressive state of checking for parameter
values is that separate users of a computer or separate folders in a user's
file system might need different values for some parameters.
+@item --config=STR
+Parse @option{STR} as a configuration file name, immediately when this option
is confronted (see @ref{Configuration files}).
+The @option{--config} option can be called multiple times in one run of any
Gnuastro program on the command-line or in the configuration files.
+In any case, it will be immediately read (before parsing the rest of the
options on the command-line, or lines in a configuration file).
+If the given file does not exist or cannot be read for any reason, the program
will print a warning and continue its processing.
+The warning can be suppressed with @option{--quiet}.
-@cartouche
-@noindent
-@strong{Checking the order:}
-You can confirm/check the order of parsing configuration files using the
@option{--checkconfig} option with any Gnuastro program, see @ref{Operating
mode options}.
-Just be sure to place this option immediately after the program name, before
any other option.
-@end cartouche
+Note that by definition, options on the command-line still take precedence
over those in any configuration file, including the file(s) given to this
option if they are called before it.
+Also see @option{--lastconfig} and @option{--onlyversion} on how this option
can be used for reproducible results.
+You can use @option{--checkconfig} (below) to check/confirm the parsing of
configuration files.
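+
+For example, the (hypothetical) call below will read the options in
@file{my-options.conf} immediately, before parsing the rest of the
command-line:
+
+@example
+$ astnoisechisel image.fits --config=my-options.conf
+@end example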
-As you see above, there can also be a configuration file containing the common
options in all the programs: @file{gnuastro.conf} (see @ref{Common options}).
-If options specific to one program are specified in this file, there will be
unrecognized option errors, or unexpected behavior if the option has different
behavior in another program.
-On the other hand, there is no problem with @file{astprogname.conf} containing
common options@footnote{As an example, the @option{--setdirconf} and
@option{--setusrconf} options will also write the common options they have read
in their produced @file{astprogname.conf}.}.
+@item --checkconfig
+Print options and their values, within the command-line or configuration
files, as they are parsed (see @ref{Configuration file precedence}).
+If an option has already been set, or is ignored by the program, this option
will also inform you with special values like @code{--ALREADY-SET--}.
+Only options that are parsed after this option are printed, so to see the
parsing of all input options, it is recommended to put this option immediately
after the program name before any other options.
-@cartouche
-@noindent
-@strong{Manipulating the order:} You can manipulate this order or add new
files with the following two options which are fully described in
-@ref{Operating mode options}:
-@table @option
-@item --config
-Allows you to define any file to be parsed as a configuration file on the
command-line or within the any other configuration file.
-Recall that the file given to @option{--config} is parsed immediately when
this option is confronted (on the command-line or in a configuration file).
+@cindex Debug
+This is a very good option for debugging: it confirms where the value of each
option has been defined in scenarios where there are multiple configuration
files.
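+
+For example, combined with @option{--printparams} (so no actual processing is
done), the (hypothetical) call below will show where every option's value
comes from:
+
+@example
+$ astnoisechisel --checkconfig --tilesize=20,20 -P
+@end example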
-@item --lastconfig
-Allows you to stop the parsing of subsequent configuration files.
-Note that if this option is given in a configuration file, it will be fully
read, so its position in the configuration does not matter (unlike
@option{--config}).
-@end table
-@end cartouche
+@item --config-prefix=STR
+Accept option names in configuration files that start with the given prefix.
+Since order matters when reading custom configuration files, this option
should be called @strong{before} the @option{--config} option(s) that contain
options with the given prefix.
+This option does not affect the options within configuration files that have
the standard name (without a prefix).
-One example of benefiting from these configuration files can be this: raw
telescope images usually have their main image extension in the second FITS
extension, while processed FITS images usually only have one extension.
-If your system-wide default input extension is 0 (the first), then when you
want to work with the former group of data you have to explicitly mention it to
the programs every time.
-With this progressive state of default values to check, you can set different
default values for the different directories that you would like to run
Gnuastro in for your different purposes, so you will not have to worry about
this issue any more.
-
-The same can be said about the @file{gnuastro.conf} files: by specifying a
behavior in this single file, all Gnuastro programs in the respective
directory, user, or system-wide steps will behave similarly.
-For example, to keep the input's directory when no specific output is given
(see @ref{Automatic output}), or to not delete an existing file if it has the
same name as a given output (see @ref{Input output options}).
+This gives unique features to Gnuastro's configuration files, especially in
large pipelines.
+Let's demonstrate this with the simple scenario below.
+You have multiple configuration files for different instances of one program
(let's assume @file{nc-a.conf} and @file{nc-b.conf}).
+At the same time, you want to load all the option names/values into your
shell as variables (for example with @code{source}).
+This happens when you want to use the options as shell variables in other
parts of your pipeline.
+If the two configuration files have different values for the same option (as
shown below), and you don't use @code{--config-prefix}, the shell will
over-write the common option values between the configuration files.
+But thanks to @code{--config-prefix}, you can give a different prefix for the
different instances of the same option in different configuration files.
-@node Current directory and User wide, System wide, Configuration file
precedence, Configuration files
-@subsection Current directory and User wide
+@example
+$ cat nc-a.conf
+a_tilesize=20,20
-@cindex @file{$HOME}
-@cindex @file{./.gnuastro/}
-@cindex @file{$HOME/.local/etc/}
-For the current (local) and user-wide directories, the configuration files are
stored in the hidden sub-directories named @file{.gnuastro/} and
@file{$HOME/.local/etc/} respectively.
-Unless you have changed it, the @file{$HOME} environment variable should point
to your home directory.
-You can check it by running @command{$ echo $HOME}.
-Each time you run any of the programs in Gnuastro, this environment variable
is read and placed in the above address.
-So if you suddenly see that your home configuration files are not being read,
probably you (or some other program) has changed the value of this environment
variable.
+$ cat nc-b.conf
+b_tilesize=40,40
-@vindex --setdirconf
-@vindex --setusrconf
-Although it might cause confusions like above, this dependence on the
@file{HOME} environment variable enables you to temporarily use a different
directory as your home directory.
-This can come in handy in complicated situations.
-To set the user or current directory configuration files based on your
command-line input, you can use the @option{--setdirconf} or
@option{--setusrconf}, see @ref{Operating mode options}.
+## Load configuration files as shell scripts (to define the
+## option name and values as shell variables with values).
+## Just note that 'source' only takes one file at a time.
+$ for c in nc-*.conf; do source $c; done
+$ astnoisechisel img.fits \
+ --config=nc-a.conf --config-prefix=a_
+$ echo "NoiseChisel run with --tilesize=$a_tilesize"
+$ astnoisechisel img.fits \
+ --config=nc-b.conf --config-prefix=b_
+$ echo "NoiseChisel run with --tilesize=$b_tilesize"
+@end example
-@node System wide, , Current directory and User wide, Configuration files
-@subsection System wide
+@item -S
+@itemx --setdirconf
+Update the current directory configuration file for the Gnuastro program and
quit.
+The full set of command-line and configuration file options will be parsed and
options with a value will be written in the current directory configuration
file for this program (see @ref{Configuration files}).
+If the configuration file or its directory does not exist, it will be created.
+If a configuration file exists it will be replaced (after it, and all other
configuration files have been read).
+In any case, the program will not run.
-@cindex @file{prefix/etc/}
-@cindex System wide configuration files
-@cindex Configuration files, system wide
-When Gnuastro is installed, the configuration files that are shipped with the
distribution are copied into the (possibly system wide) @file{prefix/etc/}
directory.
-For more details on @file{prefix}, see @ref{Installation directory} (by
default it is: @file{/usr/local}).
-This directory is the final place (with the lowest priority) that the programs
in Gnuastro will check to retrieve parameter values.
+This is the recommended method@footnote{Alternatively, you can use your
favorite text editor.} to edit/set the configuration file for all future calls
to Gnuastro's programs.
+It will internally check if your values are in the correct range and type and
save them according to the configuration file format, see @ref{Configuration
file format}.
+So if there are unreasonable values to some options, the program will notify
you and abort before writing the final configuration file.
-If you remove an option and its value from the system wide configuration
files, you either have to specify it in more immediate configuration files or
set it each time in the command-line.
-Recall that none of the programs in Gnuastro keep any internal default values
and will abort if they do not find a value for the necessary parameters (except
the number of threads and output file name).
-So even though you might never expect to use an optional option, it is safe to
have it available in this system-wide configuration file even if you do not
intend to use it frequently.
+When this option is called, the otherwise mandatory arguments, for
+example input image or catalog file(s), are no longer mandatory (since
+the program will not run).
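+
+For example, the (hypothetical) call below will check the given value and
write it into the @file{.gnuastro/astnoisechisel.conf} file of the current
directory without running NoiseChisel (so no input image is necessary):
+
+@example
+$ astnoisechisel --tilesize=30,30 --setdirconf
+@end example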
-Note that in case you install Gnuastro from your distribution's repositories,
@file{prefix} will either be set to @file{/} (the root directory) or
@file{/usr}, so you can find the system wide configuration variables in
@file{/etc/} or @file{/usr/etc/}.
-The prefix of @file{/usr/local/} is conventionally used for programs you
install from source by yourself as in @ref{Quick start}.
+@item -U
+@itemx --setusrconf
+Update the user configuration file and quit (see @ref{Configuration files}).
+See explanation under @option{--setdirconf} for more details.
+@item --lastconfig
+This is the last configuration file that must be read.
+When this option is confronted in any stage of reading the options (on the
command-line or in a configuration file), no other configuration file will be
parsed, see @ref{Configuration file precedence} and @ref{Current directory and
User wide}.
+Like all on/off options, on the command-line, this option does not take any
values.
+But in a configuration file, it takes the values of @option{0} or @option{1},
see @ref{Configuration file format}.
+If it is present in a configuration file with a value of @option{0}, then all
later occurrences of this option will be ignored.
+@item --onlyversion=STR
+Only run the program if Gnuastro's version is exactly equal to @option{STR}
(see @ref{Version numbering}).
+Note that it is not compared as a number, but as a string of characters, so
@option{0}, @option{0.0} and @option{0.00} are all different.
+If the running Gnuastro version is different, then this option will report an
error and abort as soon as it is confronted on the command-line or in a
configuration file.
+If the running Gnuastro version is the same as @option{STR}, then the program
will run as if this option was not called.
+This is useful if you want your results to be exactly reproducible and not
mistakenly run with an updated/newer or older version of the program.
+Besides internal algorithmic/behavior changes in programs, the existence of
options or their names might change between versions (especially in these
earlier versions of Gnuastro).
+Hence, when using this option (probably in a script or in a configuration
file), be sure to call it before other options.
+The benefit is that, when the version differs, the other options will not be
parsed and you, or your collaborators/users, will not get errors saying an
option in your configuration does not exist in the running version of the
program.
+Here is one example of how this option can be used in conjunction with the
@option{--lastconfig} option.
+Let's assume that you were satisfied with the results of this command:
@command{astnoisechisel image.fits --snquant=0.95} (along with various options
set in various configuration files).
+You can save the state of NoiseChisel and reproduce that exact result on
@file{image.fits} later by following these steps (the extra spaces and @key{\}
are only for easy readability; if you want to try it out, only one space
between each token is enough).
+@example
+$ echo "onlyversion X.XX" > reproducible.conf
+$ echo "lastconfig 1" >> reproducible.conf
+$ astnoisechisel image.fits --snquant=0.95 -P \
+ >> reproducible.conf
+@end example
+@option{--onlyversion} was available from Gnuastro 0.0, so putting it
immediately at the start of a configuration file will ensure that later, you
(or others using a different version) will not get an unrecognized option
error in case an option was added/removed.
+@option{--lastconfig} will inform the installed NoiseChisel to not parse any
other configuration files.
+This is done because we do not want the user's user-wide or system wide option
values affecting our results.
+Finally, with the third command, which has a @option{-P} (short for
@option{--printparams}), NoiseChisel will print all the option values visible
to it (in all the configuration files) and the shell will append them to
@file{reproducible.conf}.
+Hence, you do not have to worry about remembering the (possibly) different
options in the different configuration files.
+Afterwards, if you run NoiseChisel as shown below (telling it to read this
configuration file with the @option{--config} option), you can be sure that
there will either be an error (for version mismatch) or it will produce
exactly the same result that you got before.
-@node Getting help, Multi-threaded operations, Configuration files, Common
program behavior
-@section Getting help
+@example
+$ astnoisechisel --config=reproducible.conf
+@end example
-@cindex Help
-@cindex Book formats
-@cindex Remembering options
-@cindex Convenient book formats
-Probably the first time you read this book, it is either in the PDF or HTML
formats.
-These two formats are very convenient for when you are not actually working,
but when you are only reading.
-Later on, when you start to use the programs and you are deep in the middle of
your work, some of the details will inevitably be forgotten.
-Going to find the PDF file (printed or digital) or the HTML web page is a
major distraction.
+@item --log
+Some programs can generate extra information about their outputs in a log file.
+When this option is called in those programs, the log file will also be
printed.
+If the program does not generate a log file, this option is ignored.
-@cindex Online help
-@cindex Command-line help
-GNU software have a very unique set of tools for aiding your memory on the
command-line, where you are working, depending how much of it you need to
remember.
-In the past, such command-line help was known as ``online'' help, because they
were literally provided to you `on' the command `line'.
-However, nowadays the word ``online'' refers to something on the internet, so
that term will not be used.
-With this type of help, you can resume your exciting research without taking
your hands off the keyboard.
+@cartouche
+@noindent
+@strong{@option{--log} is not thread-safe}: The log file usually has a fixed
name.
+Therefore if two simultaneous calls (with @option{--log}) of a program are
made in the same directory, the program will try to write to the same file.
+This will cause problems like a corrupted log file, undefined behavior, or a
crash.
+@end cartouche
-@cindex Installed help methods
-Another major advantage of such command-line based help routines is that they
are installed with the software in your computer, therefore they are always in
sync with the executable you are actually running.
-Three of them are actually part of the executable.
-You do not have to worry about the version of the book or program.
-If you rely on external help (a PDF in your personal print or digital archive
or HTML from the official web page) you have to check to see if their versions
fit with your installed program.
+@cindex CPU threads, set number
+@cindex Number of CPU threads to use
+@item -N INT
+@itemx --numthreads=INT
+Use @option{INT} CPU threads when running a Gnuastro program (see
@ref{Multi-threaded operations}).
+If the value is zero (@code{0}), or this option is not given on the
command-line or any configuration file, the value will be determined at
run-time: the maximum number of threads available to the system when you run a
Gnuastro program.
-If you only need to remember the short or long names of the options,
@option{--usage} is advised.
-If it is what the options do, then @option{--help} is a great tool.
-Man pages are also provided for those who are used to this older system of
documentation.
-This full book is also available to you on the command-line in Info format.
-If none of these seems to resolve the problems, there is a mailing list which
enables you to get in touch with experienced Gnuastro users.
-In the subsections below each of these methods are reviewed.
+Note that multi-threaded programming is only relevant to some programs.
+In others, this option will be ignored.
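+
+For example, the (hypothetical) call below limits NoiseChisel to four CPU
threads, no matter how many threads the system has:
+
+@example
+$ astnoisechisel image.fits --numthreads=4
+@end example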
+@end vtable
-@menu
-* --usage:: View option names and value formats.
-* --help:: List all options with description.
-* Man pages:: Man pages generated from --help.
-* Info:: View complete book in terminal.
-* help-gnuastro mailing list:: Contacting experienced users.
-@end menu
-@node --usage, --help, Getting help, Getting help
-@subsection @option{--usage}
-@vindex --usage
-@cindex Usage pattern
-@cindex Mandatory arguments
-@cindex Optional and mandatory tokens
-If you give this option, the program will not run.
-It will only print a very concise message showing the options and arguments.
-Everything within square brackets (@option{[]}) is optional.
-For example, here are the first and last two lines of Crop's @option{--usage}
is shown:
-@example
-$ astcrop --usage
-Usage: astcrop [-Do?IPqSVW] [-d INT] [-h INT] [-r INT] [-w INT]
- [-x INT] [-y INT] [-c INT] [-p STR] [-N INT] [--deccol=INT]
- ....
- [--setusrconf] [--usage] [--version] [--wcsmode]
- [ASCIIcatalog] FITSimage(s).fits
-@end example
-There are no explanations on the options, just their short and long names
shown separately.
-After the program name, the short format of all the options that do not
require a value (on/off options) is displayed.
-Those that do require a value then follow in separate brackets, each
displaying the format of the input they want, see @ref{Options}.
-Since all options are optional, they are shown in square brackets, but
arguments can also be optional.
-For example, in this example, a catalog name is optional and is only required
in some modes.
-This is a standard method of displaying optional arguments for all GNU
software.
+@node Shell TAB completion, Standard input, Common options, Command-line
+@subsection Shell TAB completion (highly customized)
-@node --help, Man pages, --usage, Getting help
-@subsection @option{--help}
+@cartouche
+@noindent
+@strong{Under development:} Gnuastro's TAB completion in Bash already greatly
improves usage of Gnuastro on the command-line, but it is still under
development and not yet complete.
+If you are interested in trying it out, please go ahead and activate it (as
described below); we encourage this.
+But please have in mind that there are known
issues@footnote{@url{http://savannah.gnu.org/bugs/index.php?group=gnuastro&category_id=128}}
and you may find new issues.
+If you do, please get in touch with us as described in @ref{Report a bug}.
+TAB completion is currently only implemented in the following programs:
Arithmetic, BuildProgram, ConvertType, Convolve, CosmicCalculator, Crop, Fits
and Table.
+For progress on this task, please see Task
15799@footnote{@url{https://savannah.gnu.org/task/?15799}}.
+@end cartouche
-@vindex --help
-If the command-line includes this option, the program will not be run.
-It will print a complete list of all available options along with a short
explanation.
-The options are also grouped by their context.
-Within each context, the options are sorted alphabetically.
-Since the options are shown in detail afterwards, the first line of the
@option{--help} output shows the arguments and if they are optional or not,
similar to @ref{--usage}.
+@cindex Bash auto-complete
+@cindex Completion in the shell
+@cindex Bash programmable completion
+@cindex Autocomplete (in the shell/Bash)
+Bash provides a built-in feature called @emph{programmable
completion}@footnote{@url{https://www.gnu.org/software/bash/manual/html_node/Programmable-Completion.html}}
to help increase interactive workflow efficiency and minimize the number of
key-strokes @emph{and} the need to memorize things.
+It is also known as TAB completion, bash completion, auto-completion, or word
completion.
+Completion is activated by pressing @key{[TAB]} while you are typing a command.
+For file arguments this is the default behavior already and you have probably
used it a lot with any command-line program.
-In the @option{--help} output of all programs in Gnuastro, the options for
each program are classified based on context.
-The first two contexts are always options to do with the input and output
respectively.
-For example, input image extensions or supplementary input files for the
inputs.
-The last class of options is also fixed in all of Gnuastro, it shows operating
mode options.
-Most of these options are already explained in @ref{Operating mode options}.
+Besides this simple/default mode, Bash also enables a high level of
customization features for its completion.
+These features have been extensively used in Gnuastro to improve your work
efficiency@footnote{To learn how Gnuastro implements TAB completion in Bash,
see @ref{Bash programmable completion}.}.
+For example, if you are running @code{asttable} (which only accepts files
containing a table), and you press @key{[TAB]}, it will only suggest files
containing tables.
+As another example, if an option needs image HDUs within a FITS file, pressing
@key{[TAB]} will only suggest the image HDUs (and not other possibly existing
HDUs that contain tables, or just metadata).
+Just note that the file name has to be already given on the command-line
before reaching such options (that look into the contents of a file).
-@cindex Long outputs
-@cindex Redirection of output
-@cindex Command-line, long outputs
-The help message will sometimes be longer than the vertical size of your
terminal.
-If you are using a graphical user interface terminal emulator, you can scroll
the terminal with your mouse, but we promised no mice distractions! So here are
some suggestions:
+But TAB completion is not limited to file types or contents.
+Arguments/Options that take certain fixed string values will directly suggest
those strings with TAB, and completely ignore the file structure (for example,
spectral line names in @ref{Invoking astcosmiccal})!
+As another example, the @option{--numthreads} option (to specify the number
of threads used by the program) will find the number of available threads on
the system, and suggest the possible numbers with a TAB!
-@itemize
-@item
-@cindex Scroll command-line
-@cindex Command-line scroll
-@cindex @key{Shift + PageUP} and @key{Shift + PageDown}
-@key{Shift + PageUP} to scroll up and @key{Shift + PageDown} to scroll down.
-For most help output this should be enough.
-The problem is that it is limited by the number of lines that your terminal
keeps in memory and that you cannot scroll by lines, only by whole screens.
+To activate Gnuastro's custom TAB completion in Bash, you need to put the
following line in one of your Bash startup files (for example,
@file{~/.bashrc}).
+If you installed Gnuastro using the steps of @ref{Quick start}, you should
have already done this (the command just after @command{sudo make install}).
+For a list of (and discussion on) Bash startup files and installation
directories see @ref{Installation directory}.
+Of course, if Gnuastro was installed in a custom location, replace the
`@file{/usr/local}' part of the line below with the value that was given to
@option{--prefix} during Gnuastro's configuration@footnote{In case you do not
know the installation directory of Gnuastro on your system, you can find out
with this command: @code{which astfits | sed -e"s|/bin/astfits||"}}.
-@item
-@cindex Pipe
-@cindex @command{less}
-Pipe to @command{less}.
-A pipe is a form of shell re-direction.
-The @command{less} tool in Unix-like systems was made exactly for such outputs
of any length.
-You can pipe (@command{|}) the output of any program that is longer than the
screen to it and then you can scroll through (up and down) with its many tools.
-For example:
@example
-$ astnoisechisel --help | less
+# Enable Gnuastro's TAB completion
+source /usr/local/share/gnuastro/completion.bash
@end example
-@noindent
-Once you have gone through the text, you can quit @command{less} by pressing
the @key{q} key.
+After adding the line above in a Bash startup file, TAB completion will always
be activated in any new terminal.
+To see if it has been activated, try it out with @command{asttable [TAB][TAB]}
and @command{astarithmetic [TAB][TAB]} in a directory that contains tables and
images.
+The first will only suggest the files with a table, and the second, only those
with an image.
-@item
-@cindex Save output to file
-@cindex Redirection of output
-Redirect to a file.
-This is a less convenient way, because you will then have to open the file in
a text editor!
-You can do this with the shell redirection tool (@command{>}):
-@example
-$ astnoisechisel --help > filename.txt
-@end example
-@end itemize
+@cartouche
+@noindent
+@strong{TAB completion only works with long option names:}
+As described above, short options are much more complex to generalize,
therefore TAB completion is only available for long options.
+But do not worry!
+TAB completion also involves option names, so if you just type
@option{--a[TAB][TAB]}, you will get the list of options that start with an
@option{--a}.
+Therefore as a side-effect of TAB completion, your commands will be far more
human-readable with minimal key strokes.
+@end cartouche
-@cindex GNU Grep
-@cindex Searching text
-@cindex Command-line searching text
-In case you have a special keyword you are looking for in the help, you do not
have to go through the full list.
-GNU Grep is made for this job.
-For example, if you only want the list of options whose @option{--help} output
contains the word ``axis'' in Crop, you can run the following command:
+
+@node Standard input, Shell tips, Shell TAB completion, Command-line
+@subsection Standard input
+
+@cindex Standard input
+@cindex Stream: standard input
+The most common way to feed the primary/first input dataset into a program is
to give its filename as an argument (discussed in @ref{Arguments}).
+When you want to run a series of programs in sequence, this means that you
will have to keep the output of each program in a separate file and re-type
that file's name in the next command.
+This can be very slow and frustrating (for example, by mis-typing a file's
name).
+
+@cindex Standard output stream
+@cindex Stream: standard output
+To solve the problem, the founders of Unix defined pipes to directly feed the
output of one program (its ``Standard output'' stream) into the ``standard
input'' of a next program.
+This removes the need to make temporary files between separate processes and
became one of the best demonstrations of the Unix-way, or Unix philosophy.
+
+Every program has three streams identifying where it reads/writes non-file
inputs/outputs: @emph{Standard input}, @emph{Standard output}, and
@emph{Standard error}.
+When a program is called alone, all three are directed to the terminal that
you are using.
+If it needs an input, it will prompt you for one and you can type it in.
+Or, it prints its results in the terminal for you to see.
+
+For example, say you have a FITS table/catalog containing the B and V band
magnitudes (@code{MAG_B} and @code{MAG_V} columns) of a selection of galaxies
along with many other columns.
+If you want to see only these two columns in your terminal, you can use
Gnuastro's @ref{Table} program like below:
@example
-$ astcrop --help | grep axis
+$ asttable cat.fits -cMAG_B,MAG_V
@end example
-@cindex @code{ARGP_HELP_FMT}
-@cindex Argp argument parser
-@cindex Customize @option{--help} output
-@cindex @option{--help} output customization
-If the output of this option does not fit nicely within the confines of your
terminal, GNU does enable you to customize its output through the environment
variable @code{ARGP_HELP_FMT}, you can set various parameters which specify the
formatting of the help messages.
-For example, if your terminals are wider than 70 spaces (say 100) and you feel
there is too much empty space between the long options and the short
explanation, you can change these formats by giving values to this environment
variable before running the program with the @option{--help} output.
-You can define this environment variable in this manner:
+Through the Unix pipe mechanism, when the shell confronts the pipe character
(@key{|}), it connects the standard output of the program before the pipe, to
the standard input of the program after it.
+So it is literally a ``pipe'': everything that you would see printed by the
first program on the command (without any pipe), is now passed to the second
program (and not seen by you).
+
+@cindex AWK
+@cindex GNU AWK
+To continue the previous example, let's say you want to see the B-V color.
+To do this, you can pipe Table's output to AWK (a wonderful tool for
processing things like plain text tables):
+
@example
-$ export ARGP_HELP_FMT=rmargin=100,opt-doc-col=20
+$ asttable cat.fits -cMAG_B,MAG_V | awk '@{print $1-$2@}'
@end example
-@cindex @file{.bashrc}
-This will affect all GNU programs using GNU C library's @file{argp.h}
facilities as long as the environment variable is in memory.
-You can see the full list of these formatting parameters in the ``Argp User
Customization'' part of the GNU C library manual.
-If you are more comfortable to read the @option{--help} outputs of all GNU
software in your customized format, you can add your customization (similar to
the line above, without the @command{$} sign) to your @file{~/.bashrc} file.
-This is a standard option for all GNU software.
-@node Man pages, Info, --help, Getting help
-@subsection Man pages
-@cindex Man pages
-Man pages were the Unix method of providing command-line documentation to a
program.
-With GNU Info, see @ref{Info} the usage of this method of documentation is
highly discouraged.
-This is because Info provides a much more easier to navigate and read
environment.
+But trying to understand the distribution by visually inspecting all the
numbers under each other is not very useful! You can therefore feed this
single-column output into @ref{Statistics} to get a general feeling of the
distribution with the same command:
-However, some operating systems require a man page for packages that are
installed and some people are still used to this method of command-line help.
-So the programs in Gnuastro also have Man pages which are automatically
generated from the outputs of @option{--version} and @option{--help} using the
GNU help2man program.
-So if you run
@example
-$ man programname
+$ asttable cat.fits -cMAG_B,MAG_V | awk '@{print $1-$2@}' | aststatistics
@end example
+
+Gnuastro's programs that accept input from standard input only look into the
standard input stream if there is no first argument.
+In other words, arguments take precedence over Standard input.
+When no argument is provided, the programs check if the standard input stream
is already full or not (output from another program is waiting to be used).
+If data is present in the standard input stream, it is used.
+
+When the standard input is empty, the program will wait
@option{--stdintimeout} micro-seconds for you to manually enter the first line
(ending with a new-line character, or the @key{ENTER} key, see @ref{Input
output options}).
+If it detects the first line in this time, there is no more time limit, and
you can manually write/type all the lines for as long as it takes.
+To inform the program that Standard input has finished, press @key{CTRL-D}
after a new line.
+If the program does not catch the first line before the time-out finishes, it
will abort with an error saying that no input was provided.
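+
+For example, with the (hypothetical) command below, a small plain-text table
is fed into Table through a pipe (its standard input), without any input file:
+
+@example
+$ printf "1 2\n3 4\n" | asttable
+@end example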
+
+@cartouche
@noindent
-You will be provided with a man page listing the options in the
-standard manner.
+@strong{Manual input in Standard input is discarded:}
+Be careful that when you manually fill the Standard input, the data will be
discarded once the program finishes and reproducing the result will be
impossible.
+Therefore this form of providing input is only good for temporary tests.
+@end cartouche
+
+@cartouche
+@noindent
+@strong{Standard input currently only for plain text:}
+Currently Standard input only works for plain text inputs like the example
above.
+We will later allow FITS files into the programs through standard input also.
+@end cartouche
+@node Shell tips, , Standard input, Command-line
+@subsection Shell tips
+Gnuastro's programs are primarily meant to be run on the command-line shell
environment.
+In this section, we will review some useful tips and tricks that can be
helpful in the pipelines that you run.
+@menu
+* Separate shell variables for multiple outputs:: When you get values from
one command.
+@end menu
-@node Info, help-gnuastro mailing list, Man pages, Getting help
-@subsection Info
+@node Separate shell variables for multiple outputs, , Shell tips, Shell tips
+@subsubsection Separate shell variables for multiple outputs
-@cindex GNU Info
-@cindex Command-line, viewing full book
-Info is the standard documentation format for all GNU software.
-It is a very useful command-line document viewing format, fully equipped with
links between the various pages and menus and search capabilities.
-As explained before, the best thing about it is that it is available for you
the moment you need to refresh your memory on any command-line tool in the
middle of your work without having to take your hands off the keyboard.
-This complete book is available in Info format and can be accessed from
anywhere on the command-line.
+Sometimes your commands print multiple values and you want to use them as
different shell variables.
+Let's describe the problem (shown in the box below) with an example (that you
can reproduce without any external data).
-To open the Info format of any installed programs or library on your system
which has an Info format book, you can simply run the command below (change
@command{executablename} to the executable name of the program or library):
+With the commands below, we'll first make a noisy (@mymath{\sigma=5}) image
(@mymath{100\times100} pixels) using @ref{Arithmetic}.
+Then, we'll measure@footnote{The actual values printed by
@command{aststatistics} may differ slightly for you.
+This is because of a different random number generator seed used in
@command{astarithmetic}.
+To get an exactly reproducible result, see @ref{Generating random numbers}.}
its mean and standard deviation using @ref{Statistics}.
@example
-$ info executablename
+$ astarithmetic 100 100 2 makenew 5 mknoise-sigma -oimg.fits
+
+$ aststatistics img.fits --mean --std
+-3.10938611484039e-03 4.99607077069093e+00
@end example
+@cartouche
@noindent
-@cindex Learning GNU Info
-@cindex GNU software documentation
-In case you are not already familiar with it, run @command{$ info info}.
-It does a fantastic job in explaining all its capabilities itself.
-It is very short and you will become sufficiently fluent in about half an hour.
-Since all GNU software documentation is also provided in Info, your whole
GNU/Linux life will significantly improve.
-
-@cindex GNU Emacs
-@cindex GNU C library
-Once you've become an efficient navigator in Info, you can go to any part of
this book or any other GNU software or library manual, no matter how long it
is, in a matter of seconds.
-It also blends nicely with GNU Emacs (a text editor) and you can search
manuals while you are writing your document or programs without taking your
hands off the keyboard, this is most useful for libraries like the GNU C
library.
-To be able to access all the Info manuals installed in your GNU/Linux within
Emacs, type @key{Ctrl-H + i}.
+@strong{THE PROBLEM:} you want the first number printed above to be stored in
a shell variable called @code{my_mean} and the second number to be stored as
the @code{my_std} shell variable (you are free to choose any name!).
+@end cartouche
-To see this whole book from the beginning in Info, you can run
+@noindent
+The first thing that may come to mind is to run Statistics two times, and
write the output into separate variables like below:
@example
-$ info gnuastro
+$ my_std=$(aststatistics img.fits --std) ## NOT SOLUTION! ##
+$ my_mean=$(aststatistics img.fits --mean) ## NOT SOLUTION! ##
@end example
-@noindent
-If you run Info with the particular program executable name, for
-example @file{astcrop} or @file{astnoisechisel}:
+@cindex Global warming
+@cindex Carbon footprint
+But this is not a good solution because as @file{img.fits} becomes larger
(more pixels), the time it takes for Statistics to simply load the data into
memory can be significant.
+This will slow down your pipeline and, besides wasting your time, it
contributes to global warming (by spending energy on an unnecessary action;
take this seriously because your pipeline may scale up to involve thousands of
large datasets)!
+Furthermore, besides loading of the input data, Statistics (and Gnuastro in
general) is designed to do multiple measurements in one pass over the data as
much as possible (to further decrease Gnuastro's carbon footprint).
+So when given @option{--mean --std}, it will measure both in one pass over the
pixels (not two passes!).
+In other words, in this case, you get the two measurements for the cost of one.
+
+How do you separate the values from the first @command{aststatistics} command
above?
+One ugly way is to write the two-number output string into a single shell
variable and then separate, or tokenize, the string with two subsequent
commands like below:
+@c Note that the comments aren't aligned in the Texinfo source because of
+@c the '@' characters before the braces of AWK. In the output, they are
+@c aligned.
@example
-$ info astprogramname
+$ meanstd=$(aststatistics img.fits --mean --std) ## NOT SOLUTION! ##
+$ my_mean=$(echo $meanstd | awk '@{print $1@}') ## NOT SOLUTION! ##
+$ my_std=$(echo $meanstd | awk '@{print $2@}') ## NOT SOLUTION! ##
@end example
+@cartouche
@noindent
-you will be taken to the section titled ``Invoking ProgramName'' which
explains the inputs and outputs along with the command-line options for that
program.
-Finally, if you run Info with the official program name, for example, Crop or
NoiseChisel:
+@cindex Evaluate string as command (@command{eval})
+@cindex @command{eval} to evaluate string as command
+@strong{SOLUTION:} The solution is to formatted-print (@command{printf}) the
numbers as shell variables definitions in a string, and evaluate
(@command{eval}) that string as a command:
@example
-$ info ProgramName
+$ eval "$(aststatistics img.fits --mean --std \
+ | xargs printf "my_mean=%s; my_std=%s")"
@end example
+@end cartouche
@noindent
-you will be taken to the top section which introduces the program.
-Note that in all cases, Info is not case sensitive.
+Let's review the solution in more detail:
+@enumerate
+@item
+@cindex Standard input
+@cindex @command{xargs} (extended arguments)
+We pipe the output into @command{xargs}@footnote{For more on @command{xargs}, see @url{https://en.wikipedia.org/wiki/Xargs}.
+It takes the standard input (from the pipe in this scenario) and places it as arguments of the next program (@command{printf} in this scenario).
+In other words, it is good for programs that do not take input from the standard input; besides @command{printf}, this includes others like @command{cp}, @command{rm}, or @command{echo}.} (extended arguments), which puts the two numbers it gets from the pipe as arguments for @command{printf} (formatted print; because @command{printf} does not take input from pipes).
+@item
+Within the @command{printf} call, we write each value after a variable name and an equal-sign, and we separate the two definitions with a @key{;} (as if it were a shell command).
+The @code{%s} tells @command{printf} to print each input as a string (not to interpret it as a number and lose precision).
+Here is the output of this phase:
+@example
+$ aststatistics img.fits --mean --std \
+ | xargs printf "my_mean=%s; my_std=%s"
+my_mean=-3.10938611484039e-03; my_std=4.99607077069093e+00
+@end example
-@node help-gnuastro mailing list, , Info, Getting help
-@subsection help-gnuastro mailing list
+@item
+But the output above is just a string!
+To evaluate this string as a command, we give it to @command{eval}, as shown in the solution above.
+@end enumerate
-@cindex help-gnuastro mailing list
-@cindex Mailing list: help-gnuastro
-Gnuastro maintains the help-gnuastro mailing list for users to ask any
questions related to Gnuastro.
-The experienced Gnuastro users and some of its developers are subscribed to
this mailing list and your email will be sent to them immediately.
-However, when contacting this mailing list please have in mind that they are
possibly very busy and might not be able to answer immediately.
+@noindent
+After the solution above, you will have the two @code{my_mean} and
@code{my_std} variables to use separately in your pipeline:
-@cindex Mailing list archives
-@cindex @code{help-gnuastro@@gnu.org}
-To ask a question from this mailing list, send a mail to
@code{help-gnuastro@@gnu.org}.
-Anyone can view the mailing list archives at
@url{http://lists.gnu.org/archive/html/help-gnuastro/}.
-It is best that before sending a mail, you search the archives to see if
anyone has asked a question similar to yours.
-If you want to make a suggestion or report a bug, please do not send a mail to
this mailing list.
-We have other mailing lists and tools for those purposes, see @ref{Report a
bug} or @ref{Suggest new feature}.
+@example
+$ echo $my_mean
+-3.10938611484039e-03
+$ echo $my_std
+4.99607077069093e+00
+@end example
+@cindex Zsh shell
+@cindex Dash shell
+@cindex Portable script
+This @command{eval}-based solution has been tested in GNU Bash, Dash and Zsh, and it works nicely in all of them (in other words, it is ``portable'').
+This is because the constructs used here are low-level and widely available.
+For example usages of this technique, see the following sections:
@ref{Extracting a single spectrum and plotting it} and @ref{Pseudo narrow-band
images}.
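+
+As a minimal sketch of extending this technique (assuming the same @file{img.fits} as above; the @option{--median} option of Statistics is used here only as a third example measurement), the same pattern scales to any number of measurements in a single pass:
+
+@example
+$ eval "$(aststatistics img.fits --mean --std --median \
+           | xargs printf "my_mean=%s; my_std=%s; my_median=%s")"
+@end example
+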
+@node Configuration files, Getting help, Command-line, Common program behavior
+@section Configuration files
+@cindex @file{etc}
+@cindex Configuration files
+@cindex Necessary parameters
+@cindex Default option values
+@cindex File system Hierarchy Standard
+Each program needs a certain number of parameters to run.
+Supplying all the necessary parameters each time you run the program is very
frustrating and prone to errors.
+Therefore, all the programs read the values for the necessary options that you have not given on the command-line from one of several plain text files (which you can view and edit with any text editor).
+These files are known as configuration files and are usually kept in a
directory named @file{etc/} according to the file system hierarchy
+standard@footnote{@url{http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard}}.
-@node Multi-threaded operations, Numeric data types, Getting help, Common
program behavior
-@section Multi-threaded operations
+@vindex --output
+@vindex --numthreads
+@cindex CPU threads, number
+@cindex Internal default value
+@cindex Number of CPU threads to use
+The thing to have in mind is that none of the programs in Gnuastro keep any
internal default value.
+All the values must either be stored in one of the configuration files or explicitly given on the command-line.
+In case the necessary parameters are not given through any of these methods,
the program will print a missing option error and abort.
+The only exception to this is @option{--numthreads}, whose default value is
determined at run-time using the number of threads available to your system,
see @ref{Multi-threaded operations}.
+Of course, you can still provide a default value for the number of threads at
any of the levels below, but if you do not, the program will not abort.
+Also note that through automatic output name generation, the value to the @option{--output} option is also not mandatory on the command-line or in the configuration files for all programs which do not rely on that value as an input@footnote{One example of a program which uses the value given to @option{--output} as an input is ConvertType: the given value specifies the type of the output, see @ref{Invoking astconvertt}.}, see @ref{Automatic output}.
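+
+As a small, hedged illustration (any Gnuastro program would do), you can always inspect the final values that will be used, including for these two options, with the @option{-P} (or @option{--printparams}) option:
+
+@example
+$ astcrop -P | grep numthreads
+@end example
+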
-@pindex nproc
-@cindex pthread
-@cindex CPU threads
-@cindex GNU Coreutils
-@cindex Using CPU threads
-@cindex CPU, using all threads
-@cindex Multi-threaded programs
-@cindex Using multiple CPU cores
-@cindex Simultaneous multithreading
-Some of the programs benefit significantly when you use all the threads your
computer's CPU has to offer to your operating system.
-The number of threads available can be larger than the number of physical
(hardware) cores in the CPU (also known as Simultaneous multithreading).
-For example, in Intel's CPUs (those that implement its Hyper-threading
technology) the number of threads is usually double the number of physical
cores in your CPU.
-On a GNU/Linux system, the number of threads available can be found with the
command @command{$ nproc} command (part of GNU Coreutils).
-@vindex --numthreads
-@cindex Number of threads available
-@cindex Available number of threads
-@cindex Internally stored option value
-Gnuastro's programs can find the number of threads available to your system
internally at run-time (when you execute the program).
-However, if a value is given to the @option{--numthreads} option, the given
number will be used, see @ref{Operating mode options} and @ref{Configuration
files} for ways to use this option.
-Thus @option{--numthreads} is the only common option in Gnuastro's programs
with a value that does not have to be specified anywhere on the command-line or
in the configuration files.
@menu
-* A note on threads:: Caution and suggestion on using threads.
-* How to run simultaneous operations:: How to run things simultaneously.
+* Configuration file format:: ASCII format of configuration file.
+* Configuration file precedence:: Precedence of configuration files.
+* Current directory and User wide:: Local and user configuration files.
+* System wide:: System wide configuration files.
@end menu
-@node A note on threads, How to run simultaneous operations, Multi-threaded
operations, Multi-threaded operations
-@subsection A note on threads
+@node Configuration file format, Configuration file precedence, Configuration
files, Configuration files
+@subsection Configuration file format
-@cindex Using multiple threads
-@cindex Best use of CPU threads
-@cindex Efficient use of CPU threads
-Spinning off threads is not necessarily the most efficient way to run an
application.
-Creating a new thread is not a cheap operation for the operating system.
-It is most useful when the input data are fixed and you want the same
operation to be done on parts of it.
-For example, one input image to Crop and multiple crops from various parts of
it.
-In this fashion, the image is loaded into memory once, all the crops are
divided between the number of threads internally and each thread cuts out those
parts which are assigned to it from the same image.
-On the other hand, if you have multiple images and you want to crop the same
region(s) out of all of them, it is much more efficient to set
@option{--numthreads=1} (so no threads spin off) and run Crop multiple times
simultaneously, see @ref{How to run simultaneous operations}.
+@cindex Configuration file suffix
+The configuration files for each program have the standard program executable
name with a `@file{.conf}' suffix.
+When you download the source code, you can find them in the same directory as
the source code of each program, see @ref{Program source}.
-@cindex Wall-clock time
-You can check the boost in speed by first running a program on one of the data
sets with the maximum number of threads and another time (with everything else
the same) and only using one thread.
-You will notice that the wall-clock time (reported by most programs at their
end) in the former is longer than the latter divided by number of physical CPU
cores (not threads) available to your operating system.
-Asymptotically these two times can be equal (most of the time they are not).
-So limiting the programs to use only one thread and running them independently
on the number of available threads will be more efficient.
+@cindex White space character
+@cindex Configuration file format
+Any line in the configuration file whose first non-white character is a
@key{#} is considered to be a comment and is ignored.
+An empty line is also similarly ignored.
+The long name of the option should be used as an identifier.
+The option name and option value should be separated by any number of
`white-space' characters (space, tab or vertical tab) or an equal (@key{=}).
+By default, several space characters are used.
+If the value of an option has space characters (most commonly for the
@option{hdu} option), then the full value can be enclosed in double quotation
signs (@key{"}, similar to the example in @ref{Arguments and options}).
+If it is an option without a value in the @option{--help} output (on/off
option, see @ref{Options}), then the value should be @option{1} if it is to be
`on' and @option{0} otherwise.
-@cindex System Cache
-@cindex Cache, system
-Note that the operating system keeps a cache of recently processed data, so
usually, the second time you process an identical data set (independent of the
number of threads used), you will get faster results.
-In order to make an unbiased comparison, you have to first clean the system's
cache with the following command between the two runs.
+In each non-commented and non-blank line, any text after the first two words
(option identifier and value) is ignored.
+If an option identifier is not recognized in the configuration file, the name
of the file, the line number of the unrecognized option, and the unrecognized
identifier name will be reported and the program will abort.
+If a parameter is repeated more than once in the configuration files, accepts only one value, and is not set on the command-line, then only the first value will be used and the rest will be ignored.
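+
+To make the format concrete, here is a hedged sketch of what such a file may contain (the option names are common Gnuastro options, but the values are only for illustration):
+
+@example
+# Comment lines start with a `#' and are ignored.
+hdu     1      Any text after the first two words is ignored.
+quiet   0      On/off option: `1' is on and `0' is off.
+@end example
+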
-@example
-$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
-@end example
+@cindex Writing configuration files
+@cindex Automatic configuration file writing
+@cindex Configuration files, writing
+You can build or edit any of the directories and the configuration files
yourself using any text editor.
+However, it is recommended to use the @option{--setdirconf} and
@option{--setusrconf} options to set default values for the current directory
or this user, see @ref{Operating mode options}.
+With these options, the values you give will be checked before being written in the configuration file.
+These options will also print a set of commented lines to guide the reader, and will classify the options based on their context, writing them in a logical order to be more understandable.
-@cartouche
-@noindent
-@strong{SUMMARY: Should I use multiple threads?} Depends:
-@itemize
-@item
-If you only have @strong{one} data set (image in most cases!), then yes, the
more threads you use (with a maximum of the number of threads available to your
OS) the faster you will get your results.
+@node Configuration file precedence, Current directory and User wide,
Configuration file format, Configuration files
+@subsection Configuration file precedence
+
+@cindex Configuration file precedence
+@cindex Configuration file directories
+@cindex Precedence, configuration files
+The option values in all the programs of Gnuastro will be filled in the
following order.
+If an option only takes one value which is given in an earlier step, any value
for that option in a later step will be ignored.
+Note that if the @option{lastconfig} option is specified in any step below, no
other configuration files will be parsed (see @ref{Operating mode options}).
+
+@enumerate
@item
-If you want to run the same operation on @strong{multiple} data sets, it is
best to set the number of threads to 1 and use Make, or GNU Parallel, as
explained in @ref{How to run simultaneous operations}.
-@end itemize
-@end cartouche
+Command-line options, for a particular run of ProgramName.
+@item
+@file{.gnuastro/astprogname.conf} is parsed by ProgramName in the current
directory.
+@item
+@file{.gnuastro/gnuastro.conf} is parsed by all Gnuastro programs in the
current directory.
+@item
+@file{$HOME/.local/etc/astprogname.conf} is parsed by ProgramName in the
user's home directory (see @ref{Current directory and User wide}).
+@item
+@file{$HOME/.local/etc/gnuastro.conf} is parsed by all Gnuastro programs in
the user's home directory (see @ref{Current directory and User wide}).
-@node How to run simultaneous operations, , A note on threads, Multi-threaded
operations
-@subsection How to run simultaneous operations
+@item
+@file{prefix/etc/astprogname.conf} is parsed by ProgramName in the system-wide
installation directory (see @ref{System wide} for @file{prefix}).
-There are two@footnote{A third way would be to open multiple terminal emulator
windows in your GUI, type the commands separately on each and press @key{Enter}
once on each terminal, but this is far too frustrating, tedious and prone to
errors.
-It's therefore not a realistic solution when tens, hundreds or thousands of
operations (your research targets, multiplied by the operations you do on each)
are to be done.} approaches to simultaneously execute a program: using GNU
Parallel or Make (GNU Make is the most common implementation).
-The first is very useful when you only want to do one job multiple times and
want to get back to your work without actually keeping the command you ran.
-The second is usually for more important operations, with lots of dependencies
between the different products (for example, a full scientific research).
+@item
+@file{prefix/etc/gnuastro.conf} is parsed by all Gnuastro programs in the
system-wide installation directory (see @ref{System wide} for @file{prefix}).
-@table @asis
+@end enumerate
-@item GNU Parallel
-@cindex GNU Parallel
-When you only want to run multiple instances of a command on different threads
and get on with the rest of your work, the best method is to use GNU parallel.
-Surprisingly GNU Parallel is one of the few GNU packages that has no Info
documentation but only a Man page, see @ref{Info}.
-So to see the documentation after installing it please run
+The basic idea behind setting this progressive state of checking for parameter
values is that separate users of a computer or separate folders in a user's
file system might need different values for some parameters.
-@example
-$ man parallel
-@end example
+@cartouche
@noindent
-As an example, let's assume we want to crop a region fixed on the pixels (500,
600) with the default width from all the FITS images in the @file{./data}
directory ending with @file{sci.fits} to the current directory.
-To do this, you can run:
+@strong{Checking the order:}
+You can confirm/check the order of parsing configuration files using the
@option{--checkconfig} option with any Gnuastro program, see @ref{Operating
mode options}.
+Just be sure to place this option immediately after the program name, before
any other option.
+@end cartouche
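+
+For example (a minimal sketch; Statistics is used here only because it can run with just an input image):
+
+@example
+$ aststatistics --checkconfig img.fits
+@end example
+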
-@example
-$ parallel astcrop --numthreads=1 --xc=500 --yc=600 ::: \
- ./data/*sci.fits
-@end example
+As you see above, there can also be a configuration file containing the common
options in all the programs: @file{gnuastro.conf} (see @ref{Common options}).
+If options specific to one program are specified in this file, there will be unrecognized option errors, or unexpected behavior if the option behaves differently in another program.
+On the other hand, there is no problem with @file{astprogname.conf} containing
common options@footnote{As an example, the @option{--setdirconf} and
@option{--setusrconf} options will also write the common options they have read
in their produced @file{astprogname.conf}.}.
+@cartouche
@noindent
-GNU Parallel can help in many more conditions, this is one of the simplest,
see the man page for lots of other examples.
-For absolute beginners: the backslash (@command{\}) is only a line breaker to
fit nicely in the page.
-If you type the whole command in one line, you should remove it.
+@strong{Manipulating the order:} You can manipulate this order or add new
files with the following two options which are fully described in
+@ref{Operating mode options}:
+@table @option
+@item --config
+Allows you to define any file to be parsed as a configuration file on the command-line or within any other configuration file.
+Recall that the file given to @option{--config} is parsed immediately when this option is encountered (on the command-line or in a configuration file).
-@item Make
-@cindex Make
-Make is a program for building ``targets'' (e.g., files) using ``recipes'' (a
set of operations) when their known ``prerequisites'' (other files) have been
updated.
-It elegantly allows you to define dependency structures for building your
final output and updating it efficiently when the inputs change.
-It is the most common infra-structure to build software today.
+@item --lastconfig
+Allows you to stop the parsing of subsequent configuration files.
+Note that if this option is given in a configuration file, that file will still be fully read, so the option's position within the file does not matter (unlike @option{--config}).
+@end table
+@end cartouche
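+
+For example (a sketch; @file{myconf.conf} is a hypothetical file of your own making):
+
+@example
+$ astnoisechisel --config=myconf.conf img.fits
+@end example
+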
-Scientific research methodology is very similar to software development: you
start by testing a hypothesis on a small sample of objects/targets with a
simple set of steps.
-As you are able to get promising results, you improve the method and use it on
a larger, more general, sample.
-In the process, you will confront many issues that have to be corrected (bugs
in software development jargon).
-Make is a wonderful tool to manage this style of development.
+One example of benefiting from these configuration files can be this: raw
telescope images usually have their main image extension in the second FITS
extension, while processed FITS images usually only have one extension.
+If your system-wide default input extension is 0 (the first), then when you
want to work with the former group of data you have to explicitly mention it to
the programs every time.
+With this progressive state of default values to check, you can set different default values for the different directories in which you run Gnuastro for your different purposes, so you will not have to worry about this issue any more.
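+
+As a rough sketch of that scenario (the value is only for illustration), in the directory hosting the raw images you could run the commands below; from then on, any Gnuastro program run in that directory will read its default input HDU from this file (unless one is given on the command-line):
+
+@example
+$ mkdir .gnuastro
+$ echo "hdu 1" > .gnuastro/gnuastro.conf
+@end example
+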
-Besides the raw data analysis pipeline, Make has been used for producing reproducible papers, for example, see @url{https://gitlab.com/makhlaghi/NoiseChisel-paper, the reproduction pipeline} of the paper introducing @ref{NoiseChisel} (one of Gnuastro's programs).
-In fact the NoiseChisel paper's Make-based workflow was the foundation of a
parallel project called @url{http://maneage.org,Maneage} (@emph{Man}aging data
lin@emph{eage}): @url{http://maneage.org} that is described more fully in
Akhlaghi et al. @url{https://arxiv.org/abs/2006.03018, 2021}.
-Therefore, it is a very useful tool for complex scientific workflows.
+The same can be said about the @file{gnuastro.conf} files: by specifying a
behavior in this single file, all Gnuastro programs in the respective
directory, user, or system-wide steps will behave similarly.
+For example, to keep the input's directory when no specific output is given
(see @ref{Automatic output}), or to not delete an existing file if it has the
same name as a given output (see @ref{Input output options}).
-@cindex GNU Make
-GNU Make@footnote{@url{https://www.gnu.org/software/make/}} is the most common
implementation which (similar to nearly all GNU programs, comes with a
wonderful manual@footnote{@url{https://www.gnu.org/software/make/manual/}}).
-Make is very basic and simple, and thus the manual is short (the most
important parts are in the first roughly 100 pages) and easy to read/understand.
-Make comes with a @option{--jobs} (@option{-j}) option which allows you to
specify the maximum number of jobs that can be done simultaneously.
-For example, if you have 8 threads available to your operating system, you can run:
+@node Current directory and User wide, System wide, Configuration file
precedence, Configuration files
+@subsection Current directory and User wide
-@example
-$ make -j8
-@end example
+@cindex @file{$HOME}
+@cindex @file{./.gnuastro/}
+@cindex @file{$HOME/.local/etc/}
+For the current (local) and user-wide directories, the configuration files are
stored in the hidden sub-directories named @file{.gnuastro/} and
@file{$HOME/.local/etc/} respectively.
+Unless you have changed it, the @file{$HOME} environment variable should point
to your home directory.
+You can check it by running @command{$ echo $HOME}.
+Each time you run any of the programs in Gnuastro, this environment variable is read and substituted in the address above.
+So if you suddenly see that your home configuration files are not being read, it is probable that you (or some other program) have changed the value of this environment variable.
-With this command, Make will process your @file{Makefile} and create all the
targets (can be thousands of FITS images for example) simultaneously on 8
threads, while fully respecting their dependencies (only building a file/target
when its prerequisites are successfully built).
-Make is thus strongly recommended for managing scientific research where
robustness, archiving, reproducibility and speed@footnote{Besides its
multi-threaded capabilities, Make will only re-build those targets that depend
on a change you have made, not the whole work.
-For example, if you have set the prerequisites properly, you can easily test
the changing of a parameter on your paper's results without having to re-do
everything (which is much faster).
-This allows you to be much more productive in easily checking various
ideas/assumptions of the different stages of your research and thus produce a
more robust result for your exciting science.} are important.
+@vindex --setdirconf
+@vindex --setusrconf
+Although it might cause confusion like the above, this dependence on the @file{HOME} environment variable enables you to temporarily use a different directory as your home directory.
+This can come in handy in complicated situations.
+To set the current directory or user-wide configuration files based on your command-line input, you can use the @option{--setdirconf} or @option{--setusrconf} options, see @ref{Operating mode options}.
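+
+For example (a hedged sketch with a hypothetical option value), the command below would record your preferred zero point in the current directory's MakeCatalog configuration file:
+
+@example
+$ astmkcatalog --zeropoint=25.0 --setdirconf
+@end example
+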
-@end table
+@node System wide, , Current directory and User wide, Configuration files
+@subsection System wide
+@cindex @file{prefix/etc/}
+@cindex System wide configuration files
+@cindex Configuration files, system wide
+When Gnuastro is installed, the configuration files that are shipped with the
distribution are copied into the (possibly system wide) @file{prefix/etc/}
directory.
+For more details on @file{prefix}, see @ref{Installation directory} (by
default it is: @file{/usr/local}).
+This directory is the final place (with the lowest priority) that the programs
in Gnuastro will check to retrieve parameter values.
+If you remove an option and its value from the system wide configuration files, you either have to specify it in a more immediate configuration file or set it each time on the command-line.
+Recall that none of the programs in Gnuastro keep any internal default values
and will abort if they do not find a value for the necessary parameters (except
the number of threads and output file name).
+So even though you might rarely use a certain optional option, it is safe to keep it available in this system-wide configuration file.
-@node Numeric data types, Memory management, Multi-threaded operations, Common
program behavior
-@section Numeric data types
+Note that in case you install Gnuastro from your distribution's repositories,
@file{prefix} will either be set to @file{/} (the root directory) or
@file{/usr}, so you can find the system wide configuration variables in
@file{/etc/} or @file{/usr/etc/}.
+The prefix of @file{/usr/local/} is conventionally used for programs you
install from source by yourself as in @ref{Quick start}.
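+
+For example (assuming the default @file{/usr/local} prefix of a source install), you can inspect the shipped configuration files with a command like this:
+
+@example
+$ ls /usr/local/etc/*.conf
+@end example
+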
-@cindex Bit
-@cindex Type
-At the lowest level, the computer stores everything in terms of @code{1} or
@code{0}.
-For example, each program in Gnuastro, or each astronomical image you take
with the telescope is actually a string of millions of these zeros and ones.
-The space required to keep a zero or one is the smallest unit of storage, and
is known as a @emph{bit}.
-However, understanding and manipulating this string of bits is extremely hard
for most people.
-Therefore, different standards are defined to package the bits into separate
@emph{type}s with a fixed interpretation of the bits in each package.
-@cindex Byte
-@cindex Signed integer
-@cindex Unsigned integer
-@cindex Integer, Signed
-To store numbers, the most basic standard/type is for integers (@mymath{...,
-2, -1, 0, 1, 2, ...}).
-The common integer types are 8, 16, 32, and 64 bits wide (more bits will give
larger limits).
-Each bit corresponds to a power of 2 and they are summed to create the final
number.
-In the integer types, for each width there are two standards for reading the
bits: signed and unsigned.
-In the `signed' convention, one bit is reserved for the sign (stating that the
integer is positive or negative).
-The `unsigned' integers use that bit in the actual number and thus contain
only positive numbers (starting from zero).
-Therefore, at the same number of bits, both signed and unsigned integers can
allow the same number of integers, but the positive limit of the
@code{unsigned} types is double their @code{signed} counterparts with the same
width (at the expense of not having negative numbers).
-When the context of your work does not involve negative numbers (for example,
counting, where negative is not defined), it is best to use the @code{unsigned}
types.
-For the full numerical range of all integer types, see below.
-Another standard of converting a given number of bits to numbers is the
floating point standard, this standard can @emph{approximately} store any real
number with a given precision.
-There are two common floating point types: 32-bit and 64-bit, for single and
double precision floating point numbers respectively.
-The former is sufficient for data with less than 8 significant decimal digits
(most astronomical data), while the latter is good for less than 16 significant
decimal digits.
-The representation of real numbers as bits is much more complex than integers.
-If you are interested to learn more about it, you can start with the
@url{https://en.wikipedia.org/wiki/Floating_point, Wikipedia article}.
-Practically, you can use Gnuastro's Arithmetic program to convert/change the
type of an image/datacube (see @ref{Arithmetic}), or Gnuastro Table program to
convert a table column's data type (see @ref{Column arithmetic}).
-Conversion of a dataset's type is necessary in some contexts.
-For example, the program/library, that you intend to feed the data into, only
accepts floating point values, but you have an integer image/column.
-Another situation that conversion can be helpful is when you know that your
data only has values that fit within @code{int8} or @code{uint16}.
-However it is currently formatted in the @code{float64} type.
-The important thing to consider is that operations involving wider, floating
point, or signed types can be significantly slower than smaller-width, integer,
or unsigned types respectively.
-Note that besides speed, a wider type also requires much more storage space
(by 4 or 8 times).
-Therefore, when you confront such situations that can be optimized and want to
store/archive/transfer the data, it is best to use the most efficient type.
-For example, if your dataset (image or table column) only has positive
integers less than 65535, store it as an unsigned 16-bit integer for faster
processing, faster transfer, and less storage space.
-The short and long names for the recognized numeric data types in Gnuastro are
listed below.
-Both short and long names can be used when you want to specify a type.
-For example, as a value to the common option @option{--type} (see @ref{Input
output options}), or in the information comment lines of @ref{Gnuastro text
table format}.
-The ranges listed below are inclusive.
-@table @code
-@item u8
-@itemx uint8
-8-bit unsigned integers, range:@*
-@mymath{[0\rm{\ to\ }2^8-1]} or @mymath{[0\rm{\ to\ }255]}.
-@item i8
-@itemx int8
-8-bit signed integers, range:@*
-@mymath{[-2^7\rm{\ to\ }2^7-1]} or @mymath{[-128\rm{\ to\ }127]}.
-@item u16
-@itemx uint16
-16-bit unsigned integers, range:@*
-@mymath{[0\rm{\ to\ }2^{16}-1]} or @mymath{[0\rm{\ to\ }65535]}.
+@node Getting help, Multi-threaded operations, Configuration files, Common
program behavior
+@section Getting help
-@item i16
-@itemx int16
-16-bit signed integers, range:@* @mymath{[-2^{15}\rm{\ to\ }2^{15}-1]} or
-@mymath{[-32768\rm{\ to\ }32767]}.
+@cindex Help
+@cindex Book formats
+@cindex Remembering options
+@cindex Convenient book formats
+Probably the first time you read this book, it is either in the PDF or HTML formats.
+These two formats are very convenient when you are only reading, but not while you are actually working.
+Later on, when you start to use the programs and you are deep in the middle of
your work, some of the details will inevitably be forgotten.
+Going to find the PDF file (printed or digital) or the HTML web page is a
major distraction.
-@item u32
-@itemx uint32
-32-bit unsigned integers, range:@* @mymath{[0\rm{\ to\ }2^{32}-1]} or
-@mymath{[0\rm{\ to\ }4294967295]}.
+@cindex Online help
+@cindex Command-line help
+GNU software has a unique set of tools for aiding your memory on the command-line (where you are working), depending on how much of it you need to remember.
+In the past, such command-line help was known as ``online'' help, because it was literally provided to you `on' the command `line'.
+However, nowadays the word ``online'' refers to something on the internet, so that term will not be used here.
+With this type of help, you can resume your exciting research without taking your hands off the keyboard.
-@item i32
-@itemx int32
-32-bit signed integers, range:@* @mymath{[-2^{31}\rm{\ to\ }2^{31}-1]} or
-@mymath{[-2147483648\rm{\ to\ }2147483647]}.
+@cindex Installed help methods
+Another major advantage of such command-line based help routines is that they are installed with the software on your computer, so they are always in sync with the executable you are actually running.
+Three of them are actually part of the executable, so you do not have to worry about the version of the book or program.
+If you rely on external help (a PDF in your personal print or digital archive, or HTML from the official web page), you have to check whether their versions fit with your installed program.
-@item u64
-@itemx uint64
-64-bit unsigned integers, range@* @mymath{[0\rm{\ to\ }2^{64}-1]} or
-@mymath{[0\rm{\ to\ }18446744073709551615]}.
+If you only need to remember the short or long names of the options, @option{--usage} is advised.
+If you need to remember what the options do, then @option{--help} is a great tool.
+Man pages are also provided for those who are used to this older system of documentation.
+This full book is also available to you on the command-line in Info format.
+If none of these resolves the problem, there is a mailing list which enables you to get in touch with experienced Gnuastro users.
+In the subsections below, each of these methods is reviewed.
-@item i64
-@itemx int64
-64-bit signed integers, range:@* @mymath{[-2^{63}\rm{\ to\ }2^{63}-1]} or
-@mymath{[-9223372036854775808\rm{\ to\ }9223372036854775807]}.
-@item f32
-@itemx float32
-32-bit (single-precision) floating point types.
-The maximum (minimum is its negative) possible value is
@mymath{3.402823\times10^{38}}.
-Single-precision floating points can accurately represent a floating point
number up to @mymath{\sim7.2} significant decimals.
-Given the heavy noise in astronomical data, this is usually more than
sufficient for storing results.
-For more, see @ref{Printing floating point numbers}.
+@menu
+* --usage:: View option names and value formats.
+* --help:: List all options with description.
+* Man pages:: Man pages generated from --help.
+* Info:: View complete book in terminal.
+* help-gnuastro mailing list:: Contacting experienced users.
+@end menu
-@item f64
-@itemx float64
-64-bit (double-precision) floating point types.
-The maximum (minimum is its negative) possible value is @mymath{\sim10^{308}}.
-Double-precision floating points can accurately represent a floating point
number @mymath{\sim15.9} significant decimals.
-This is usually good for processing (mixing) the data internally, for example,
a sum of single precision data (and later storing the result as @code{float32}).
-For more, see @ref{Printing floating point numbers}.
-@end table
+@node --usage, --help, Getting help, Getting help
+@subsection @option{--usage}
+@vindex --usage
+@cindex Usage pattern
+@cindex Mandatory arguments
+@cindex Optional and mandatory tokens
+If you give this option, the program will not run.
+It will only print a very concise message showing the options and arguments.
+Everything within square brackets (@option{[]}) is optional.
+For example, the first and last two lines of Crop's @option{--usage} output are shown below:
-@cartouche
-@noindent
-@strong{Some file formats do not recognize all types.} for example, the FITS
standard (see @ref{Fits}) does not define @code{uint64} in binary tables or
images.
-When a type is not acceptable for output into a given file format, the
respective Gnuastro program or library will let you know and abort.
-On the command-line, you can convert the numerical type of an image, or table
column into another type with @ref{Arithmetic} or @ref{Table} respectively.
-If you are writing your own program, you can use the
@code{gal_data_copy_to_new_type()} function in Gnuastro's library, see
@ref{Copying datasets}.
-@end cartouche
+@example
+$ astcrop --usage
+Usage: astcrop [-Do?IPqSVW] [-d INT] [-h INT] [-r INT] [-w INT]
+ [-x INT] [-y INT] [-c INT] [-p STR] [-N INT] [--deccol=INT]
+ ....
+ [--setusrconf] [--usage] [--version] [--wcsmode]
+ [ASCIIcatalog] FITSimage(s).fits
+@end example
+There are no explanations of the options, just their short and long names shown separately.
+After the program name, the short format of all the options that do not
require a value (on/off options) is displayed.
+Those that do require a value then follow in separate brackets, each
displaying the format of the input they want, see @ref{Options}.
+Since all options are optional, they are shown in square brackets, but arguments can also be optional.
+In this example, a catalog name is optional and is only required in some modes.
+This is a standard method of displaying optional arguments for all GNU
software.
+@node --help, Man pages, --usage, Getting help
+@subsection @option{--help}
-@node Memory management, Tables, Numeric data types, Common program behavior
-@section Memory management
+@vindex --help
+If the command-line includes this option, the program will not be run.
+It will print a complete list of all available options along with a short
explanation.
+The options are also grouped by their context.
+Within each context, the options are sorted alphabetically.
+Since the options are shown in detail afterwards, the first line of the @option{--help} output shows the arguments and whether they are optional, similar to @ref{--usage}.
-@cindex Memory management
-@cindex Non-volatile memory
-@cindex Memory, non-volatile
-In this section we will review how Gnuastro manages your input data in your
system's memory.
-Knowing this can help you optimize your usage (in speed and memory
consumption) when the data volume is large and approaches, or exceeds, your
available RAM (usually in various calls to multiple programs simultaneously).
-But before diving into the details, let's have a short basic introduction to
memory in general and in particular the types of memory most relevant to this
discussion.
+In the @option{--help} output of all programs in Gnuastro, the options for
each program are classified based on context.
+The first two contexts are always options to do with the input and output
respectively.
+For example, input image extensions or supplementary input files for the
inputs.
+The last class of options is also fixed in all of Gnuastro: it shows the operating mode options.
+Most of these options are already explained in @ref{Operating mode options}.
-Input datasets (that are later fed into programs for analysis) are commonly
first stored in @emph{non-volatile memory}.
-This is a type of memory that does not need a constant power supply to keep
the data and is therefore primarily aimed for long-term storage, like HDDs or
SSDs.
-So data in this type of storage is preserved when you turn off your computer.
-But by its nature, non-volatile memory is much slower, in reading or writing,
than the speeds that CPUs can process the data.
-Thus relying on this type of memory alone would create a bad bottleneck in the
input/output (I/O) phase of any processing.
+@cindex Long outputs
+@cindex Redirection of output
+@cindex Command-line, long outputs
+The help message will sometimes be longer than the vertical size of your
terminal.
+If you are using a graphical user interface terminal emulator, you can scroll
the terminal with your mouse, but we promised no mice distractions! So here are
some suggestions:
-@cindex RAM
-@cindex Volatile memory
-@cindex Memory, volatile
-The first step to decrease this bottleneck is to have a faster storage space,
but with a much limited storage volume.
-For this type of storage, computers have a Random Access Memory (or RAM).
-RAM is classified as a @emph{volatile memory} because it needs a constant flow
of electricity to keep the information.
-In other words, the moment power is cut-off, all the stored information in
your RAM is gone (hence the ``volatile'' name).
-But thanks to that constant supply of power, it can access any random address
with equal (and very high!) speed.
+@itemize
+@item
+@cindex Scroll command-line
+@cindex Command-line scroll
+@cindex @key{Shift + PageUp} and @key{Shift + PageDown}
+@key{Shift + PageUp} to scroll up and @key{Shift + PageDown} to scroll down.
+For most help output this should be enough.
+The problem is that it is limited by the number of lines that your terminal
keeps in memory and that you cannot scroll by lines, only by whole screens.
-Hence, the general/simplistic way that programs deal with memory is the
following (this is general to almost all programs, not just Gnuastro's):
-1) Load/copy the input data from the non-volatile memory into RAM.
-2) Use the copy of the data in RAM as input for all the internal processing as
well as the intermediate data that is necessary during the processing.
-3) Finally, when the analysis is complete, write the final output data back
into non-volatile memory, and free/delete all the used space in the RAM (the
initial copy and all the intermediate data).
-Usually the RAM is most important for the data of the intermediate steps (that
you never see as a user of a program!).
+@item
+@cindex Pipe
+@cindex @command{less}
+Pipe to @command{less}.
+A pipe is a form of shell re-direction.
+The @command{less} tool in Unix-like systems was made exactly for such outputs
of any length.
+You can pipe (@command{|}) the output of any program that is longer than the
screen to it and then you can scroll through (up and down) with its many tools.
+For example:
+@example
+$ astnoisechisel --help | less
+@end example
+@noindent
+Once you have gone through the text, you can quit @command{less} by pressing
the @key{q} key.
-When the input dataset(s) to a program are small (compared to the available
space in your system's RAM at the moment it is run) Gnuastro's programs and
libraries follow the standard series of steps above.
-The only exception is that deleting the intermediate data is not only done at
the end of the program.
-As soon as an intermediate dataset is no longer necessary for the next
internal steps, the space it occupied is deleted/freed.
-This allows Gnuastro programs to minimize their usage of your system's RAM
over the full running time.
-The situation gets complicated when the datasets are large (compared to your
available RAM when the program is run).
-For example, if a dataset is half the size of your system's available RAM, and
the program's internal analysis needs three or more intermediately processed
copies of it at one moment in its analysis.
-There will not be enough RAM to keep those higher-level intermediate data.
-In such cases, programs that do not do any memory management will crash.
-But fortunately Gnuastro's programs do have a memory management plans for such
situations.
+@item
+@cindex Save output to file
+@cindex Redirection of output
+Redirect to a file.
+This is a less convenient way, because you will then have to open the file in
a text editor!
+You can do this with the shell redirection tool (@command{>}):
+@example
+$ astnoisechisel --help > filename.txt
+@end example
+@end itemize
-@cindex Memory-mapped file
-When the necessary amount of space for an intermediate dataset cannot be
allocated in the RAM, Gnuastro's programs will not use the RAM at all.
-They will use the ``memory-mapped file'' concept in modern operating systems
to create a randomly-named file in your non-volatile memory and use that
instead of the RAM.
-That file will have the exact size (in bytes) of that intermediate dataset.
-Any time the program needs that intermediate dataset, the operating system
will directly go to that file, and bypass your RAM.
-As soon as that file is no longer necessary for the analysis, it will be
deleted.
-But as mentioned above, non-volatile memory has much slower I/O speed than the
RAM.
-Hence in such situations, the programs will become noticeably slower
(sometimes by factors of 10 times slower, depending on your non-volatile memory
speed).
+@cindex GNU Grep
+@cindex Searching text
+@cindex Command-line searching text
+In case you have a special keyword you are looking for in the help, you do not
have to go through the full list.
+GNU Grep is made for this job.
+For example, if you only want the list of options whose @option{--help} output
contains the word ``axis'' in Crop, you can run the following command:
-Because of the drop in I/O speed (and thus the speed of your running program),
the moment that any to-be-allocated dataset is memory-mapped, Gnuastro's
programs and libraries will notify you with a descriptive statement like below
(can happen in any phase of their analysis).
-It shows the location of the memory-mapped file, its size, complemented with a
small description of the cause, a pointer to this section of the book for more
information on how to deal with it (if necessary), and what to do to suppress
it.
+@example
+$ astcrop --help | grep axis
+@end example
+@cindex @code{ARGP_HELP_FMT}
+@cindex Argp argument parser
+@cindex Customize @option{--help} output
+@cindex @option{--help} output customization
+If the output of this option does not fit nicely within the confines of your terminal, GNU does enable you to customize its output through the environment variable @code{ARGP_HELP_FMT}; with it, you can set various parameters which specify the formatting of the help messages.
+For example, if your terminal is wider than 70 characters (say 100) and you feel there is too much empty space between the long options and the short explanations, you can change these formats by giving values to this environment variable before running the program with @option{--help}.
+You can define this environment variable in this manner:
@example
-astarithmetic: ./gnuastro_mmap/Fu7Dhs: temporary memory-mapped file
-(XXXXXXXXXXX bytes) created for intermediate data that is not stored
-in RAM (see the "Memory management" section of Gnuastro's manual for
-optimizing your project's memory management, and thus speed). To
-disable this warning, please use the option '--quiet-mmap'
+$ export ARGP_HELP_FMT=rmargin=100,opt-doc-col=20
@end example
+@cindex @file{.bashrc}
+This will affect all GNU programs using GNU C library's @file{argp.h}
facilities as long as the environment variable is in memory.
+You can see the full list of these formatting parameters in the ``Argp User
Customization'' part of the GNU C library manual.
+If you are more comfortable reading the @option{--help} outputs of all GNU software in your customized format, you can add your customization (similar to the line above, without the @command{$} sign) to your @file{~/.bashrc} file.
+This is a standard feature of all GNU software.
-@noindent
-Finally, when the intermediate dataset is no longer necessary, the program
will automatically delete it and notify you with a statement like this:
+@node Man pages, Info, --help, Getting help
+@subsection Man pages
+@cindex Man pages
+Man pages were the Unix method of providing command-line documentation to a
program.
+With GNU Info (see @ref{Info}), the use of this method of documentation is highly discouraged.
+This is because Info provides a much easier environment to navigate and read.
+However, some operating systems require a man page for the packages that are installed, and some people are still used to this method of command-line help.
+So the programs in Gnuastro also have man pages, which are automatically generated from the outputs of @option{--version} and @option{--help} using the GNU help2man program.
+So if you run
@example
-astarithmetic: ./gnuastro_mmap/Fu7Dhs: deleted
+$ man programname
@end example
-
@noindent
-To disable these messages, you can run the program with @code{--quietmmap}, or
set the @code{quietmmap} variable in the allocating library function to be
non-zero.
+you will be provided with a man page listing the options in the standard manner.
-An important component of these messages is the name of the memory-mapped file.
-Knowing that the file has been deleted is important for the user if the
program crashes for any reason: internally (for example, a parameter is given
wrongly) or externally (for example, you mistakenly kill the running job).
-In the event of a crash, the memory-mapped files will not be deleted and you
have to manually delete them because they are usually large and they may soon
fill your full storage if not deleted in a long time due to successive crashes.
-This brings us to managing the memory-mapped files in your non-volatile memory.
-In other words: knowing where they are saved, or intentionally placing them in
different places of your file system, or deleting them when necessary.
-As the examples above show, memory-mapped files are stored in a sub-directory
of the running directory called @file{gnuastro_mmap}.
-If this directory does not exist, Gnuastro will automatically create it when
memory mapping becomes necessary.
-Alternatively, it may happen that the @file{gnuastro_mmap} sub-directory
exists and is not writable, or it cannot be created.
-In such cases, the memory-mapped file for each dataset will be created in the
running directory with a @file{gnuastro_mmap_} prefix.
-Therefore one easy way to delete all memory-mapped files in case of a crash,
is to delete everything within the sub-directory (first command below), or all
files stating with this prefix:
+
+
+@node Info, help-gnuastro mailing list, Man pages, Getting help
+@subsection Info
+
+@cindex GNU Info
+@cindex Command-line, viewing full book
+Info is the standard documentation format for all GNU software.
+It is a very useful command-line document viewing format, fully equipped with
links between the various pages and menus and search capabilities.
+As explained before, the best thing about it is that it is available for you
the moment you need to refresh your memory on any command-line tool in the
middle of your work without having to take your hands off the keyboard.
+This complete book is available in Info format and can be accessed from
anywhere on the command-line.
+
+To open the Info format of any installed program or library on your system that has an Info format book, you can simply run the command below (change @command{executablename} to the executable name of the program or library):
@example
-rm -f gnuastro_mmap/*
-rm -f gnuastro_mmap_*
+$ info executablename
@end example
-A much more common issue when dealing with memory-mapped files is their
location.
-For example, you may be running a program in a partition that is hosted by an
HDD.
-But you also have another partition on an SSD (which has much faster I/O).
-So you want your memory-mapped files to be created in the SSD to speed up your
processing.
-In this scenario, you want your project source directory to only contain your
plain-text scripts and you want your project's built products (even the
temporary memory-mapped files) to be built in a different location because they
are large; thus I/O speed becomes important.
+@noindent
+@cindex Learning GNU Info
+@cindex GNU software documentation
+In case you are not already familiar with it, run @command{$ info info}.
+It does a fantastic job in explaining all its capabilities itself.
+It is very short and you will become sufficiently fluent in about half an hour.
+Since all GNU software documentation is also provided in Info, your whole
GNU/Linux life will significantly improve.
-To host the memory-mapped files in another location (with fast I/O), you can
set (@file{gnuastro_mmap}) to be a symbolic link to it.
-For example, let's assume you want your memory-mapped files to be stored in
@file{/path/to/dir/for/mmap}.
-All you have to do is to run the following command before your Gnuastro
analysis command(s).
+@cindex GNU Emacs
+@cindex GNU C library
+Once you've become an efficient navigator in Info, you can go to any part of
this book or any other GNU software or library manual, no matter how long it
is, in a matter of seconds.
+It also blends nicely with GNU Emacs (a text editor), so you can search manuals while you are writing your document or programs without taking your hands off the keyboard; this is most useful for libraries like the GNU C library.
+To be able to access all the Info manuals installed in your GNU/Linux within
Emacs, type @key{Ctrl-H + i}.
+
+To see this whole book from the beginning in Info, you can run
@example
-ln -s /path/to/dir/for/mmap gnuastro_mmap
+$ info gnuastro
@end example
-The programs will delete a memory-mapped file when it is no longer needed, but
they will not delete the @file{gnuastro_mmap} directory that hosts them.
-So if your project involves many Gnuastro programs (possibly called in
parallel) and you want your memory-mapped files to be in a different location,
you just have to make the symbolic link above once at the start, and all the
programs will use it if necessary.
+@noindent
+If you run Info with the particular program executable name, for
+example @file{astcrop} or @file{astnoisechisel}:
-Another memory-management scenario that may happen is this: you do not want a
Gnuastro program to allocate internal datasets in the RAM at all.
-For example, the speed of your Gnuastro-related project does not matter at
that moment, and you have higher-priority jobs that are being run at the same
time which need to have RAM available.
-In such cases, you can use the @option{--minmapsize} option that is available
in all Gnuastro programs (see @ref{Processing options}).
-Any intermediate dataset that has a size larger than the value of this option
will be memory-mapped, even if there is space available in your RAM.
-For example, if you want any dataset larger than 100 megabytes to be
memory-mapped, use @option{--minmapsize=100000000} (8 zeros!).
+@example
+$ info astprogramname
+@end example
-@cindex Linux kernel
-@cindex Kernel, Linux
-You should not set the value of @option{--minmapsize} to be too small,
otherwise even small intermediate values (that are usually very numerous) in
the program will be memory-mapped.
-However the kernel can only host a limited number of memory-mapped files at
every moment (by all running programs combined).
-For example, in the default@footnote{If you need to host more memory-mapped
files at one moment, you need to build your own customized Linux kernel.} Linux
kernel on GNU/Linux operating systems this limit is roughly 64000.
-If the total number of memory-mapped files exceeds this number, all the
programs using them will crash.
-Gnuastro's programs will warn you if your given value is too small and may
cause a problem later.
+@noindent
+you will be taken to the section titled ``Invoking ProgramName'' which
explains the inputs and outputs along with the command-line options for that
program.
+Finally, if you run Info with the official program name, for example, Crop or
NoiseChisel:
-Actually, the default behavior for Gnuastro's programs (to only use
memory-mapped files when there is not enough RAM) is a side-effect of
@option{--minmapsize}.
-The pre-defined value to this option is an extremely large value in the
lowest-level Gnuastro configuration file (the installed @file{gnuastro.conf}
described in @ref{Configuration file precedence}).
-This value is larger than the largest possible available RAM.
-You can check by running any Gnuastro program with a @option{-P} option.
-Because no dataset will be larger than this, by default the programs will
first attempt to use the RAM for temporary storage.
-But if writing in the RAM fails (for any reason, mainly due to lack of
available space), then a memory-mapped file will be created.
+@example
+$ info ProgramName
+@end example
+@noindent
+you will be taken to the top section which introduces the program.
+Note that in all cases, Info is not case sensitive.
+@node help-gnuastro mailing list, , Info, Getting help
+@subsection help-gnuastro mailing list
-@node Tables, Tessellation, Memory management, Common program behavior
-@section Tables
+@cindex help-gnuastro mailing list
+@cindex Mailing list: help-gnuastro
+Gnuastro maintains the help-gnuastro mailing list for users to ask any
questions related to Gnuastro.
+The experienced Gnuastro users and some of its developers are subscribed to
this mailing list and your email will be sent to them immediately.
+However, when contacting this mailing list, please keep in mind that they are possibly very busy and might not be able to answer immediately.
-``A table is a collection of related data held in a structured format within a
database.
-It consists of columns, and rows.'' (from Wikipedia).
-Each column in the table contains the values of one property and each row is a
collection of properties (columns) for one target object.
-For example, let's assume you have just ran MakeCatalog (see
@ref{MakeCatalog}) on an image to measure some properties for the labeled
regions (which might be detected galaxies for example) in the image.
-For each labeled region (detected galaxy), there will be a @emph{row} which
groups its measured properties as @emph{columns}, one column for each property.
-One such property can be the object's magnitude, which is the sum of pixels
with that label, or its center can be defined as the light-weighted average
value of those pixels.
-Many such properties can be derived from the raw pixel values and their
position, see @ref{Invoking astmkcatalog} for a long list.
+@cindex Mailing list archives
+@cindex @code{help-gnuastro@@gnu.org}
+To ask a question from this mailing list, send a mail to
@code{help-gnuastro@@gnu.org}.
+Anyone can view the mailing list archives at
@url{http://lists.gnu.org/archive/html/help-gnuastro/}.
+Before sending a mail, it is best to search the archives to see if anyone
has asked a question similar to yours.
+If you want to make a suggestion or report a bug, please do not send a mail to
this mailing list.
+We have other mailing lists and tools for those purposes, see @ref{Report a
bug} or @ref{Suggest new feature}.
-As a summary, for each labeled region (or, galaxy) we have one @emph{row} and
for each measured property we have one @emph{column}.
-This high-level structure is usually the first step for higher-level analysis,
for example, finding the stellar mass or photometric redshift from magnitudes
in multiple colors.
-Thus, tables are not just outputs of programs, in fact it is much more common
for tables to be inputs of programs.
-For example, to make a mock galaxy image, you need to feed in the properties
of each galaxy into @ref{MakeProfiles} for it do the inverse of the process
above and make a simulated image from a catalog, see @ref{Sufi simulates a
detection}.
-In other cases, you can feed a table into @ref{Crop} and it will crop out
regions centered on the positions within the table, see @ref{Reddest clumps
cutouts and parallelization}.
-So to end this relatively long introduction, tables play a very important role
in astronomy, or generally all branches of data analysis.
-In @ref{Recognized table formats} the currently recognized table formats in
Gnuastro are discussed.
-You can use any of these tables as input or ask for them to be built as output.
-The most common type of table format is a simple plain text file with each row
on one line and columns separated by white space characters, this format is
easy to read/write by eye/hand.
-To give it the full functionality of more specific table types like the FITS
tables, Gnuastro has a special convention which you can use to give each column
a name, type, unit, and comments, while still being readable by other plain
text table readers.
-This convention is described in @ref{Gnuastro text table format}.
-When tables are input to a program, the program reading it needs to know which
column(s) it should use for its desired purposes.
-Gnuastro's programs all follow a similar convention, on the way you can select
columns in a table.
-They are thoroughly discussed in @ref{Selecting table columns}.
+@node Multi-threaded operations, Numeric data types, Getting help, Common
program behavior
+@section Multi-threaded operations
+
+@pindex nproc
+@cindex pthread
+@cindex CPU threads
+@cindex GNU Coreutils
+@cindex Using CPU threads
+@cindex CPU, using all threads
+@cindex Multi-threaded programs
+@cindex Using multiple CPU cores
+@cindex Simultaneous multithreading
+Some of the programs benefit significantly when you use all the threads your
computer's CPU has to offer to your operating system.
+The number of threads available can be larger than the number of physical
(hardware) cores in the CPU; this is known as simultaneous multithreading.
+For example, in Intel's CPUs (those that implement its Hyper-threading
technology) the number of threads is usually double the number of physical
cores in your CPU.
+On a GNU/Linux system, the number of threads available can be found with
the @command{nproc} command (part of GNU Coreutils).
+
+@vindex --numthreads
+@cindex Number of threads available
+@cindex Available number of threads
+@cindex Internally stored option value
+Gnuastro's programs can find the number of threads available to your system
internally at run-time (when you execute the program).
+However, if a value is given to the @option{--numthreads} option, the given
number will be used, see @ref{Operating mode options} and @ref{Configuration
files} for ways to use this option.
+Thus @option{--numthreads} is the only common option in Gnuastro's programs
with a value that does not have to be specified anywhere on the command-line or
in the configuration files.
+
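+As an example of this option, the hypothetical command below would limit
NoiseChisel (see @ref{NoiseChisel}) to four threads on an input called
@file{image.fits} (the file name is only for illustration here):
+
+@example
+$ astnoisechisel image.fits --numthreads=4
+@end example
+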
@menu
-* Recognized table formats:: Table formats that are recognized in Gnuastro.
-* Gnuastro text table format:: Gnuastro's convention plain text tables.
-* Selecting table columns:: Identify/select certain columns from a table
+* A note on threads:: Caution and suggestion on using threads.
+* How to run simultaneous operations:: How to run things simultaneously.
@end menu
-@node Recognized table formats, Gnuastro text table format, Tables, Tables
-@subsection Recognized table formats
+@node A note on threads, How to run simultaneous operations, Multi-threaded
operations, Multi-threaded operations
+@subsection A note on threads
-The list of table formats that Gnuastro can currently read from and write to
are described below.
-Each has their own advantage and disadvantages, so a short review of the
format is also provided to help you make the best choice based on how you want
to define your input tables or later use your output tables.
+@cindex Using multiple threads
+@cindex Best use of CPU threads
+@cindex Efficient use of CPU threads
+Spinning off threads is not necessarily the most efficient way to run an
application.
+Creating a new thread is not a cheap operation for the operating system.
+It is most useful when the input data are fixed and you want the same
operation to be done on parts of it.
+For example, one input image to Crop and multiple crops from various parts of
it.
+In this fashion, the image is loaded into memory once, all the crops are
divided between the number of threads internally and each thread cuts out those
parts which are assigned to it from the same image.
+On the other hand, if you have multiple images and you want to crop the same
region(s) out of all of them, it is much more efficient to set
@option{--numthreads=1} (so no threads spin off) and run Crop multiple times
simultaneously, see @ref{How to run simultaneous operations}.
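+For example, the first scenario above (one image, many crops) could be a
single Crop command like the one below, where @file{image.fits} and
@file{positions.txt} are hypothetical names for the input image and a
catalog of crop centers (depending on your catalog, you may also need
@option{--coordcol} to identify the coordinate columns, see
@ref{Crop options}):
+
+@example
+$ astcrop image.fits --mode=img --catalog=positions.txt
+@end example
+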
-@table @asis
+@cindex Wall-clock time
+You can check the boost in speed by running a program on one of the data
sets twice: once with the maximum number of threads, and once (with
everything else the same) using only one thread.
+You will notice that the wall-clock time (reported by most programs at
their end) of the multi-threaded run is longer than the single-threaded
run's time divided by the number of physical CPU cores (not threads)
available to your operating system.
+Asymptotically these two times can be equal (most of the time they are not).
+So limiting the programs to use only one thread and running them
independently on the number of available threads will be more efficient.
-@item Plain text table
-This is the most basic and simplest way to create, view, or edit the table by
hand on a text editor.
-The other formats described below are less eye-friendly and have a more formal
structure (for easier computer readability).
-It is fully described in @ref{Gnuastro text table format}.
+@cindex System Cache
+@cindex Cache, system
+Note that the operating system keeps a cache of recently processed data, so
usually, the second time you process an identical data set (independent of the
number of threads used), you will get faster results.
+In order to make an unbiased comparison, you have to first clean the system's
cache with the following command between the two runs.
-@cindex FITS Tables
-@cindex Tables FITS
-@cindex ASCII table, FITS
-@item FITS ASCII tables
-The FITS ASCII table extension is fully in ASCII encoding and thus easily
readable on any text editor (assuming it is the only extension in the FITS
file).
-If the FITS file also contains binary extensions (for example, an image or
binary table extensions), then there will be many hard to print characters.
-The FITS ASCII format does not have new line characters to separate rows.
-In the FITS ASCII table standard, each row is defined as a fixed number of
characters (value to the @code{NAXIS1} keyword), so to visually inspect it
properly, you would have to adjust your text editor's width to this value.
-All columns start at given character positions and have a fixed width (number
of characters).
+@example
+$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
+@end example
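+
+For example, an unbiased comparison on a hypothetical @file{image.fits}
could therefore look like this (using the shell's @command{time} keyword
to measure the wall-clock time of each run):
+
+@example
+$ time astnoisechisel image.fits
+$ sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
+$ time astnoisechisel image.fits --numthreads=1
+@end example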
-Numbers in a FITS ASCII table are printed into ASCII format, they are not in
binary (that the CPU uses).
-Hence, they can take a larger space in memory, loose their precision, and take
longer to read into memory.
-If you are dealing with integer type columns (see @ref{Numeric data types}),
another issue with FITS ASCII tables is that the type information for the
column will be lost (there is only one integer type in FITS ASCII tables).
-One problem with the binary format on the other hand is that it is not
portable (different CPUs/compilers) have different standards for translating
the zeros and ones.
-But since ASCII characters are defined on a byte and are well recognized, they
are better for portability on those various systems.
-Gnuastro's plain text table format described below is much more portable and
easier to read/write/interpret by humans manually.
+@cartouche
+@noindent
+@strong{SUMMARY: Should I use multiple threads?} Depends:
+@itemize
+@item
+If you only have @strong{one} data set (image in most cases!), then yes, the
more threads you use (with a maximum of the number of threads available to your
OS) the faster you will get your results.
-Generally, as the name implies, this format is useful for when your table
mainly contains ASCII columns (for example, file names, or descriptions).
-They can be useful when you need to include columns with structured ASCII
information along with other extensions in one FITS file.
-In such cases, you can also consider header keywords (see @ref{Fits}).
+@item
+If you want to run the same operation on @strong{multiple} data sets, it is
best to set the number of threads to 1 and use Make, or GNU Parallel, as
explained in @ref{How to run simultaneous operations}.
+@end itemize
+@end cartouche
-@cindex Binary table, FITS
-@item FITS binary tables
-The FITS binary table is the FITS standard's solution to the issues discussed
with keeping numbers in ASCII format as described under the FITS ASCII table
title above.
-Only columns defined as a string type (a string of ASCII characters) are
readable in a text editor.
-The portability problem with binary formats discussed above is mostly solved
thanks to the portability of CFITSIO (see @ref{CFITSIO}) and the very long
history of the FITS format which has been widely used since the 1970s.
-In the case of most numbers, storing them in binary format is more memory
efficient than ASCII format.
-For example, to store @code{-25.72034} in ASCII format, you need 9
bytes/characters.
-But if you keep this same number (to the approximate precision possible) as a
4-byte (32-bit) floating point number, you can keep/transmit it with less than
half the amount of memory.
-When catalogs contain thousands/millions of rows in tens/hundreds of columns,
this can lead to significant improvements in memory/band-width usage.
-Moreover, since the CPU does its operations in the binary formats, reading the
table in and writing it out is also much faster than an ASCII table.
-When you are dealing with integer numbers, the compression ratio can be even
better, for example, if you know all of the values in a column are positive and
less than @code{255}, you can use the @code{unsigned char} type which only
takes one byte! If they are between @code{-128} and @code{127}, then you can
use the (signed) @code{char} type.
-So if you are thoughtful about the limits of your integer columns, you can
greatly reduce the size of your file and also the speed at which it is
read/written.
-This can be very useful when sharing your results with collaborators or
publishing them.
-To decrease the file size even more you can name your output as ending in
@file{.fits.gz} so it is also compressed after creation.
-Just note that compression/decompressing is CPU intensive and can slow down
the writing/reading of the file.
-Fortunately the FITS Binary table format also accepts ASCII strings as column
types (along with the various numerical types).
-So your dataset can also contain non-numerical columns.
-@end table
+@node How to run simultaneous operations, , A note on threads, Multi-threaded
operations
+@subsection How to run simultaneous operations
-@menu
-* Gnuastro text table format:: Reading plain text tables
-@end menu
+There are two@footnote{A third way would be to open multiple terminal emulator
windows in your GUI, type the commands separately on each and press @key{Enter}
once on each terminal, but this is far too frustrating, tedious and prone to
errors.
+It's therefore not a realistic solution when tens, hundreds or thousands of
operations (your research targets, multiplied by the operations you do on each)
are to be done.} approaches to simultaneously execute a program: using GNU
Parallel or Make (GNU Make is the most common implementation).
+The first is very useful when you only want to do one job multiple times and
want to get back to your work without actually keeping the command you ran.
+The second is usually for more important operations, with lots of
dependencies between the different products (for example, a full
scientific research project).
-@node Gnuastro text table format, Selecting table columns, Recognized table
formats, Tables
-@subsection Gnuastro text table format
+@table @asis
-Plain text files are the most generic, portable, and easiest way to (manually)
create, (visually) inspect, or (manually) edit a table.
-In this format, the ending of a row is defined by the new-line character (a
line on a text editor).
-So when you view it on a text editor, every row will occupy one line.
-The delimiters (or characters separating the columns) are white space
characters (space, horizontal tab, vertical tab) and a comma (@key{,}).
-The only further requirement is that all rows/lines must have the same number
of columns.
+@item GNU Parallel
+@cindex GNU Parallel
+When you only want to run multiple instances of a command on different
threads and get on with the rest of your work, the best method is to use
GNU Parallel.
+Surprisingly, GNU Parallel is one of the few GNU packages that has no Info
documentation, only a man page, see @ref{Info}.
+So to see its documentation after installing it, please run
-The columns do not have to be exactly under each other and the rows can be
arbitrarily long with different lengths.
-For example, the following contents in a file would be interpreted as a table
with 4 columns and 2 rows, with each element interpreted as a 64-bit floating
point type (see @ref{Numeric data types}).
+@example
+$ man parallel
+@end example
+@noindent
+As an example, let's assume we want to crop a region fixed on the pixels (500,
600) with the default width from all the FITS images in the @file{./data}
directory ending with @file{sci.fits} to the current directory.
+To do this, you can run:
@example
-1 2.234948 128 39.8923e8
-2 , 4.454 792 72.98348e7
+$ parallel astcrop --numthreads=1 --xc=500 --yc=600 ::: \
+ ./data/*sci.fits
@end example
-However, the example above has no other information about the columns (it is
just raw data, with no meta-data).
-To use this table, you have to remember what the numbers in each column
represent.
-Also, when you want to select columns, you have to count their position within
the table.
-This can become frustrating and prone to bad errors (getting the columns wrong
in your scientific project!) especially as the number of columns increase.
-It is also bad for sending to a colleague, because they will find it hard to
remember/use the columns properly.
+@noindent
+GNU Parallel can help in many more situations; this is one of the
simplest, see the man page for many other examples.
+For absolute beginners: the backslash (@command{\}) is only a line breaker
to fit nicely on the page.
+If you type the whole command in one line, you should remove it.
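+For example, with GNU Parallel's @option{--jobs} (or @option{-j}) option
you can limit the number of simultaneous jobs; the hypothetical command
below would run at most four instances of Crop at any moment:
+
+@example
+$ parallel --jobs=4 astcrop --numthreads=1 --xc=500 --yc=600 \
+          ::: ./data/*sci.fits
+@end example
+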
-To solve these problems in Gnuastro's programs/libraries you are not limited
to using the column's number, see @ref{Selecting table columns}.
-If the columns have names, units, or comments you can also select your columns
based on searches/matches in these fields, for example, see @ref{Table}.
-Also, in this manner, you cannot guide the program reading the table on how to
read the numbers.
-As an example, the first and third columns above can be read as integer types:
the first column might be an ID and the third can be the number of pixels an
object occupies in an image.
-So there is no need to read these to columns as a 64-bit floating point type
(which takes more memory, and is slower).
+@item Make
+@cindex Make
+Make is a program for building ``targets'' (e.g., files) using ``recipes'' (a
set of operations) when their known ``prerequisites'' (other files) have been
updated.
+It elegantly allows you to define dependency structures for building your
final output and updating it efficiently when the inputs change.
+It is the most common infrastructure for building software today.
-In the bare-minimum example above, you also cannot use strings of characters,
for example, the names of filters, or some other identifier that includes
non-numerical characters.
-In the absence of any information, only numbers can be read robustly.
-Assuming we read columns with non-numerical characters as string, there would
still be the problem that the strings might contain space (or any delimiter)
character for some rows.
-So, each `word' in the string will be interpreted as a column and the program
will abort with an error that the rows do not have the same number of columns.
+Scientific research methodology is very similar to software development: you
start by testing a hypothesis on a small sample of objects/targets with a
simple set of steps.
+Once you get promising results, you improve the method and use it on a
larger, more general sample.
+In the process, you will confront many issues that have to be corrected (bugs
in software development jargon).
+Make is a wonderful tool to manage this style of development.
-To correct for these limitations, Gnuastro defines the following convention
for storing the table meta-data along with the raw data in one plain text file.
-The format is primarily designed for ease of reading/writing by eye/fingers,
but is also structured enough to be read by a program.
+Besides the raw data analysis pipeline, Make has been used for producing
reproducible papers; for example, see
@url{https://gitlab.com/makhlaghi/NoiseChisel-paper, the reproduction pipeline}
of the paper introducing @ref{NoiseChisel} (one of Gnuastro's programs).
+In fact, the NoiseChisel paper's Make-based workflow was the foundation of
a parallel project called @url{http://maneage.org,Maneage} (@emph{Man}aging
data lin@emph{eage}), which is described more fully in Akhlaghi et al.
(@url{https://arxiv.org/abs/2006.03018, 2021}).
+Therefore, it is a very useful tool for complex scientific workflows.
-When the first non-white character in a line is @key{#}, or there are no
non-white characters in it, then the line will not be considered as a row of
data in the table (this is a pretty standard convention in many programs, and
higher level languages).
-In the first case (when the first character of the line is @key{#}), the line
is interpreted as a @emph{comment}.
+@cindex GNU Make
+GNU Make@footnote{@url{https://www.gnu.org/software/make/}} is the most
common implementation, which (similar to nearly all GNU programs) comes
with a wonderful manual@footnote{@url{https://www.gnu.org/software/make/manual/}}.
+Make is very basic and simple, and thus the manual is short (the most
important parts are in roughly the first 100 pages) and easy to
read/understand.
-If the comment line starts with `@code{# Column N:}', then it is assumed to
contain information about column @code{N} (a number, counting from 1).
-Comment lines that do not start with this pattern are ignored and you can use
them to include any further information you want to store with the table in the
text file.
-The most generic column information comment line has the following format:
+Make comes with a @option{--jobs} (@option{-j}) option which allows you to
specify the maximum number of jobs that can be done simultaneously.
+For example, if you have 8 threads available to your operating system, you
can run:
@example
-# Column N: NAME [UNIT, TYPE(NUM), BLANK] COMMENT
+$ make -j8
@end example
-@cindex NaN
-@noindent
-Any sequence of characters between `@key{:}' and `@key{[}' will be interpreted
as the column name (so it can contain anything except the `@key{[}' character).
-Anything between the `@key{]}' and the end of the line is defined as a comment.
-Within the brackets, anything before the first `@key{,}' is the units
(physical units, for example, km/s, or erg/s), anything before the second
`@key{,}' is the short type identifier (see below, and @ref{Numeric data
types}).
+With this command, Make will process your @file{Makefile} and create all
the targets (which can be thousands of FITS images, for example)
simultaneously on 8 threads, while fully respecting their dependencies
(only building a file/target when its prerequisites are successfully
built).
+Make is thus strongly recommended for managing scientific research where
robustness, archiving, reproducibility and speed@footnote{Besides its
multi-threaded capabilities, Make will only re-build those targets that depend
on a change you have made, not the whole work.
+For example, if you have set the prerequisites properly, you can easily
test the effect of changing a parameter on your paper's results without
having to re-do everything (which is much faster).
+This allows you to be much more productive in easily checking various
ideas/assumptions of the different stages of your research and thus produce a
more robust result for your exciting science.} are important.
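+
+As a minimal sketch of this logic (all file names here are hypothetical),
the @file{Makefile} below crops every image in a @file{data/} directory;
running it with @command{make -j8} will build up to 8 of the crops
simultaneously:
+
+@example
+## One '*-crop.fits' target per input in 'data/'.
+inputs  := $(wildcard data/*.fits)
+targets := $(patsubst data/%.fits,%-crop.fits,$(inputs))
+
+all: $(targets)
+
+## Recipe to build each crop from its corresponding input.
+%-crop.fits: data/%.fits
+	astcrop --numthreads=1 --xc=500 --yc=600 $< --output=$@@
+@end example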
-If the type identifier is not recognized, the default 64-bit floating point
type will be used.
-The type identifier can optionally be followed by an integer within
parenthesis.
-If the parenthesis is present and the integer is larger than 1, the column is
assumed to be a ``vector column'' (which can have multiple values, for more see
@ref{Vector columns}).
+@end table
-Finally (still within the brackets), any non-white characters after the second
`@key{,}' are interpreted as the blank value for that column (see @ref{Blank
pixels}).
-The blank value can either be in the same type as the column (for example,
@code{-99} for a signed integer column), or any string (for example, @code{NaN}
in that same column).
-In both cases, the values will be stored in memory as Gnuastro's fixed blank
values for each type.
-For floating point types, Gnuastro's internal blank value is IEEE NaN
(Not-a-Number).
-For signed integers, it is the smallest possible value and for unsigned
integers its the largest possible value.
-When a formatting problem occurs, or when the column was already given
meta-data in a previous comment, or when the column number is larger than the
actual number of columns in the table (the non-commented or empty lines), then
the comment information line will be ignored.
-When a comment information line can be used, the leading and trailing white
space characters will be stripped from all of the elements.
-For example, in this line:
-@example
-# Column 5: column name [km/s, f32,-99] Redshift as speed
-@end example
-The @code{NAME} field will be `@code{column name}' and the @code{TYPE} field
will be `@code{f32}'.
-Note how all the white space characters before and after strings are not used,
but those in the middle remained.
-Also, white space characters are not mandatory.
-Hence, in the example above, the @code{BLANK} field will be given the value of
`@code{-99}'.
+@node Numeric data types, Memory management, Multi-threaded operations, Common
program behavior
+@section Numeric data types
-Except for the column number (@code{N}), the rest of the fields are optional.
-Also, the column information comments do not have to be in order.
-In other words, the information for column @mymath{N+m} (@mymath{m>0}) can be
given in a line before column @mymath{N}.
-Furthermore, you do not have to specify information for all columns.
-Those columns that do not have this information will be interpreted with the
default settings (like the case above: values are double precision floating
point, and the column has no name, unit, or comment).
-So these lines are all acceptable for any table (the first one, with nothing
but the column number is redundant):
+@cindex Bit
+@cindex Type
+At the lowest level, the computer stores everything in terms of @code{1} or
@code{0}.
+For example, each program in Gnuastro, or each astronomical image you take
with the telescope is actually a string of millions of these zeros and ones.
+The space required to keep a zero or one is the smallest unit of storage, and
is known as a @emph{bit}.
+However, understanding and manipulating this string of bits is extremely hard
for most people.
+Therefore, different standards are defined to package the bits into separate
@emph{type}s with a fixed interpretation of the bits in each package.
-@example
-# Column 5:
-# Column 1: ID [,i8] The Clump ID.
-# Column 3: mag_f160w [AB mag, f32] Magnitude from the F160W filter
-@end example
+@cindex Byte
+@cindex Signed integer
+@cindex Unsigned integer
+@cindex Integer, Signed
+To store numbers, the most basic standard/type is for integers (@mymath{...,
-2, -1, 0, 1, 2, ...}).
+The common integer types are 8, 16, 32, and 64 bits wide (more bits will give
larger limits).
+Each bit corresponds to a power of 2 and they are summed to create the final
number.
+In the integer types, for each width there are two standards for reading the
bits: signed and unsigned.
+In the `signed' convention, one bit is reserved for the sign (stating that the
integer is positive or negative).
+The `unsigned' integers use that bit in the actual number and thus contain
only positive numbers (starting from zero).
-@noindent
-The data type of the column should be specified with one of the following
values:
+Therefore, with the same number of bits, signed and unsigned integers can
represent the same number of distinct values, but the positive limit of the
@code{unsigned} types is about double that of their @code{signed}
counterparts with the same width (at the expense of not having negative
numbers).
+When the context of your work does not involve negative numbers (for example,
counting, where negative is not defined), it is best to use the @code{unsigned}
types.
+For the full numerical range of all integer types, see below.
-@itemize
-@item
-For a numeric column, you can use any of the numeric types (and their
-recognized identifiers) described in @ref{Numeric data types}.
-@item
-`@code{strN}': for strings.
-The @code{N} value identifies the length of the string (how many characters it
has).
-The start of the string on each row is the first non-delimiter character of
the column that has the string type.
-The next @code{N} characters will be interpreted as a string and all leading
and trailing white space will be removed.
+Another standard for converting a given number of bits to numbers is the
floating point standard; this standard can @emph{approximately} store any
real number with a given precision.
+There are two common floating point types: 32-bit and 64-bit, for single and
double precision floating point numbers respectively.
+The former is sufficient for data with less than 8 significant decimal digits
(most astronomical data), while the latter is good for less than 16 significant
decimal digits.
+The representation of real numbers as bits is much more complex than integers.
+If you are interested in learning more about it, you can start with the
@url{https://en.wikipedia.org/wiki/Floating_point, Wikipedia article}.
-If the next column's characters, are closer than @code{N} characters to the
start of the string column in that line/row, they will be considered part of
the string column.
-If there is a new-line character before the ending of the space given to the
string column (in other words, the string column is the last column), then
reading of the string will stop, even if the @code{N} characters are not
complete yet.
-See @file{tests/table/table.txt} for one example.
-Therefore, the only time you have to pay attention to the positioning and
spaces given to the string column is when it is not the last column in the
table.
+Practically, you can use Gnuastro's Arithmetic program to convert/change
the type of an image/datacube (see @ref{Arithmetic}), or Gnuastro's Table
program to convert a table column's data type (see @ref{Column arithmetic}).
+Conversion of a dataset's type is necessary in some contexts.
+For example, the program/library that you intend to feed the data into may
only accept floating point values, but you have an integer image/column.
+Another situation where conversion can be helpful is when you know that
your data only has values that fit within @code{int8} or @code{uint16}, but
is currently stored in the @code{float64} type.
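+
+For example, a conversion like the second scenario above could look like
the command below (the file names are hypothetical; @code{uint16} is one of
the type-conversion operators of @ref{Arithmetic}):
+
+@example
+$ astarithmetic image-f64.fits uint16 --output=image-u16.fits
+@end example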
-The only limitation in this format is that trailing and leading white space
characters will be removed from the columns that are read.
-In most cases, this is the desired behavior, but if trailing and leading
white-spaces are critically important to your analysis, define your own
starting and ending characters and remove them after the table has been read.
-For example, in the sample table below, the two `@key{|}' characters (which
are arbitrary) will remain in the value of the second column and you can remove
them manually later.
-If only one of the leading or trailing white spaces is important for your
work, you can only use one of the `@key{|}'s.
+The important thing to consider is that operations involving wider, floating
point, or signed types can be significantly slower than smaller-width, integer,
or unsigned types respectively.
+Note that besides speed, a wider type also requires much more storage space
(by 4 or 8 times).
+Therefore, when such optimizations are possible and you want to
store/archive/transfer the data, it is best to use the most efficient type.
+For example, if your dataset (image or table column) only has positive
integers less than 65535, store it as an unsigned 16-bit integer for faster
processing, faster transfer, and less storage space.
-@example
-# Column 1: ID [label, u8]
-# Column 2: Notes [no unit, str50]
-1 leading and trailing white space is ignored here 2.3442e10
-2 | but they will be preserved here | 8.2964e11
-@end example
+The short and long names for the recognized numeric data types in Gnuastro are
listed below.
+Both short and long names can be used when you want to specify a type.
+For example, as a value to the common option @option{--type} (see @ref{Input
output options}), or in the information comment lines of @ref{Gnuastro text
table format}.
+The ranges listed below are inclusive.
-@end itemize
+@table @code
+@item u8
+@itemx uint8
+8-bit unsigned integers, range:@*
+@mymath{[0\rm{\ to\ }2^8-1]} or @mymath{[0\rm{\ to\ }255]}.
-Note that the FITS binary table standard does not define the @code{unsigned
int} and @code{unsigned long} types, so if you want to convert your tables to
FITS binary tables, use other types.
-Also, note that in the FITS ASCII table, there is only one integer type
(@code{long}).
-So if you convert a Gnuastro plain text table to a FITS ASCII table with the
@ref{Table} program, the type information for integers will be lost.
-Conversely if integer types are important for you, you have to manually set
them when reading a FITS ASCII table (for example, with the Table program when
reading/converting into a file, or with the @file{gnuastro/table.h} library
functions when reading into memory).
+@item i8
+@itemx int8
+8-bit signed integers, range:@*
+@mymath{[-2^7\rm{\ to\ }2^7-1]} or @mymath{[-128\rm{\ to\ }127]}.
+@item u16
+@itemx uint16
+16-bit unsigned integers, range:@*
+@mymath{[0\rm{\ to\ }2^{16}-1]} or @mymath{[0\rm{\ to\ }65535]}.
-@node Selecting table columns, , Gnuastro text table format, Tables
-@subsection Selecting table columns
+@item i16
+@itemx int16
+16-bit signed integers, range:@* @mymath{[-2^{15}\rm{\ to\ }2^{15}-1]} or
+@mymath{[-32768\rm{\ to\ }32767]}.
-At the lowest level, the only defining aspect of a column in a table is its
number, or position.
-But selecting columns purely by number is not very convenient and, especially
when the tables are large it can be very frustrating and prone to errors.
-Hence, table file formats (for example, see @ref{Recognized table formats})
have ways to store additional information about the columns (meta-data).
-Some of the most common pieces of information about each column are its
@emph{name}, the @emph{units} of data in it, and a @emph{comment} for
longer/informal description of the column's data.
+@item u32
+@itemx uint32
+32-bit unsigned integers, range:@* @mymath{[0\rm{\ to\ }2^{32}-1]} or
+@mymath{[0\rm{\ to\ }4294967295]}.
-To facilitate research with Gnuastro, you can select columns by matching, or
searching in these three fields, besides the low-level column number.
-To view the full list of information on the columns in the table, you can use
the Table program (see @ref{Table}) with the command below (replace
@file{table-file} with the filename of your table, if its FITS, you might also
need to specify the HDU/extension which contains the table):
+@item i32
+@itemx int32
+32-bit signed integers, range:@* @mymath{[-2^{31}\rm{\ to\ }2^{31}-1]} or
+@mymath{[-2147483648\rm{\ to\ }2147483647]}.
-@example
-$ asttable --information table-file
-@end example
+@item u64
+@itemx uint64
+64-bit unsigned integers, range:@* @mymath{[0\rm{\ to\ }2^{64}-1]} or
+@mymath{[0\rm{\ to\ }18446744073709551615]}.
-Gnuastro's programs need the columns for different purposes, for example, in
Crop, you specify the columns containing the central coordinates of the crop
centers with the @option{--coordcol} option (see @ref{Crop options}).
-On the other hand, in MakeProfiles, to specify the column containing the
profile position angles, you must use the @option{--pcol} option (see
@ref{MakeProfiles catalog}).
-Thus, there can be no unified common option name to select columns for all
programs (different columns have different purposes).
-However, when the program expects a column for a specific context, the option
names end in the @option{col} suffix like the examples above.
-These options accept values in integer (column number), or string (metadata
match/search) format.
+@item i64
+@itemx int64
+64-bit signed integers, range:@* @mymath{[-2^{63}\rm{\ to\ }2^{63}-1]} or
+@mymath{[-9223372036854775808\rm{\ to\ }9223372036854775807]}.
-If the value can be parsed as a positive integer, it will be seen as the
low-level column number.
-Note that column counting starts from 1, so if you ask for column 0, the
respective program will abort with an error.
-When the value cannot be interpreted as an a integer number, it will be seen
as a string of characters which will be used to match/search in the table's
meta-data.
-The meta-data field which the value will be compared with can be selected
through the @option{--searchin} option, see @ref{Input output options}.
-@option{--searchin} can take three values: @code{name}, @code{unit},
@code{comment}.
-The matching will be done following this convention:
+@item f32
+@itemx float32
+32-bit (single-precision) floating point types.
+The maximum (minimum is its negative) possible value is
@mymath{3.402823\times10^{38}}.
+Single-precision floating points can accurately represent a floating point
number up to @mymath{\sim7.2} significant decimals.
+Given the heavy noise in astronomical data, this is usually more than
sufficient for storing results.
+For more, see @ref{Printing floating point numbers}.
-@itemize
-@item
-If the value is enclosed in two slashes (for example, @command{-x/RA_/}, or
@option{--coordcol=/RA_/}, see @ref{Crop options}), then it is assumed to be a
regular expression with the same convention as GNU AWK.
-GNU AWK has a very well written
@url{https://www.gnu.org/software/gawk/manual/html_node/Regexp.html, chapter}
describing regular expressions, so we will not continue discussing them here.
-Regular expressions are a very powerful tool in matching text and useful in
many contexts.
-We thus strongly encourage reviewing this chapter for greatly improving the
quality of your work in many cases, not just for searching column meta-data in
Gnuastro.
+@item f64
+@itemx float64
+64-bit (double-precision) floating point types.
+The maximum (minimum is its negative) possible value is @mymath{\sim10^{308}}.
+Double-precision floating points can accurately represent a floating point
number up to @mymath{\sim15.9} significant decimals.
+This is usually good for processing (mixing) the data internally, for example,
a sum of single precision data (and later storing the result as @code{float32}).
+For more, see @ref{Printing floating point numbers}.
+@end table
-@item
-When the string is not enclosed between `@key{/}'s, any column that exactly
matches the given value in the given field will be selected.
-@end itemize
+@cartouche
+@noindent
+@strong{Some file formats do not recognize all types.} For example, the FITS
standard (see @ref{Fits}) does not define @code{uint64} in binary tables or
images.
+When a type is not acceptable for output into a given file format, the
respective Gnuastro program or library will let you know and abort.
+On the command-line, you can convert the numerical type of an image, or table
column into another type with @ref{Arithmetic} or @ref{Table} respectively.
+If you are writing your own program, you can use the
@code{gal_data_copy_to_new_type()} function in Gnuastro's library, see
@ref{Copying datasets}.
+@end cartouche
-Note that in both cases, you can ignore the case of alphabetic characters with
the @option{--ignorecase} option, see @ref{Input output options}.
-Also, in both cases, multiple columns may be selected with one call to this
function.
-In this case, the order of the selected columns (with one call) will be the
same order as they appear in the table.
+@node Memory management, Tables, Numeric data types, Common program behavior
+@section Memory management
+@cindex Memory management
+@cindex Non-volatile memory
+@cindex Memory, non-volatile
+In this section we will review how Gnuastro manages your input data in your
system's memory.
+Knowing this can help you optimize your usage (in speed and memory
consumption) when the data volume is large and approaches, or exceeds, your
available RAM (usually in various calls to multiple programs simultaneously).
+But before diving into the details, let's have a short basic introduction to
memory in general and in particular the types of memory most relevant to this
discussion.
+Input datasets (that are later fed into programs for analysis) are commonly
first stored in @emph{non-volatile memory}.
+This is a type of memory that does not need a constant power supply to keep
the data and is therefore primarily aimed for long-term storage, like HDDs or
SSDs.
+So data in this type of storage is preserved when you turn off your computer.
+But by its nature, non-volatile memory is much slower to read or write
than the speed at which CPUs can process data.
+Thus relying on this type of memory alone would create a severe bottleneck
in the input/output (I/O) phase of any processing.
-@node Tessellation, Automatic output, Tables, Common program behavior
-@section Tessellation
-
-It is sometimes necessary to classify the elements in a dataset (for example,
pixels in an image) into a grid of individual, non-overlapping tiles.
-For example, when background sky gradients are present in an image, you can
define a tile grid over the image.
-When the tile sizes are set properly, the background's variation over each
tile will be negligible, allowing you to measure (and subtract) it.
-In other cases (for example, spatial domain convolution in Gnuastro, see
@ref{Convolve}), it might simply be for speed of processing: each tile can be
processed independently on a separate CPU thread.
-In the arts and mathematics, this process is formally known as
@url{https://en.wikipedia.org/wiki/Tessellation, tessellation}.
+@cindex RAM
+@cindex Volatile memory
+@cindex Memory, volatile
+The first step to decrease this bottleneck is to have a faster storage
space, even if it has a much more limited volume.
+For this type of storage, computers have Random Access Memory (or RAM).
+RAM is classified as a @emph{volatile memory} because it needs a constant flow
of electricity to keep the information.
+In other words, the moment power is cut-off, all the stored information in
your RAM is gone (hence the ``volatile'' name).
+But thanks to that constant supply of power, it can access any random address
with equal (and very high!) speed.
-The size of the regular tiles (in units of data-elements, or pixels in an
image) can be defined with the @option{--tilesize} option.
-It takes multiple numbers (separated by a comma) which will be the length
along the respective dimension (in FORTRAN/FITS dimension order).
-Divisions are also acceptable, but must result in an integer.
-For example, @option{--tilesize=30,40} can be used for an image (a 2D dataset).
-The regular tile size along the first FITS axis (horizontal when viewed in SAO
DS9) will be 30 pixels and along the second it will be 40 pixels.
-Ideally, @option{--tilesize} should be selected such that all tiles in the
image have exactly the same size.
-In other words, that the dataset length in each dimension is divisible by the
tile size in that dimension.
+Hence, the general/simplistic way that programs deal with memory is the
following (this is general to almost all programs, not just Gnuastro's):
+@enumerate
+@item
+Load/copy the input data from the non-volatile memory into RAM.
+@item
+Use the copy of the data in RAM as input for all the internal processing,
as well as for the intermediate data that is necessary during the
processing.
+@item
+Finally, when the analysis is complete, write the final output data back
into non-volatile memory, and free/delete all the used space in the RAM
(the initial copy and all the intermediate data).
+@end enumerate
+Usually the RAM is most important for the data of the intermediate steps (that
you never see as a user of a program!).
-However, this is not always possible: the dataset can be any size and every
pixel in it is valuable.
-In such cases, Gnuastro will look at the significance of the remainder length,
if it is not significant (for example, one or two pixels), then it will just
increase the size of the first tile in the respective dimension and allow the
rest of the tiles to have the required size.
-When the remainder is significant (for example, one pixel less than the size
along that dimension), the remainder will be added to one regular tile's size
and the large tile will be cut in half and put in the two ends of the
grid/tessellation.
-In this way, all the tiles in the central regions of the dataset will have the
regular tile sizes and the tiles on the edge will be slightly larger/smaller
depending on the remainder significance.
-The fraction which defines the remainder significance along all dimensions can
be set through @option{--remainderfrac}.
+When the input dataset(s) to a program are small (compared to the
available space in your system's RAM at the moment it is run), Gnuastro's
programs and libraries follow the standard series of steps above.
+The only exception is that the intermediate data are not deleted only at
the end of the program.
+As soon as an intermediate dataset is no longer necessary for the next
internal steps, the space it occupied is deleted/freed.
+This allows Gnuastro programs to minimize their usage of your system's RAM
over the full running time.
-The best tile size is directly related to the spatial properties of the
property you want to study (for example, gradient on the image).
-In practice we assume that the gradient is not present over each tile.
-So if there is a strong gradient (for example, in long wavelength ground based
images) or the image is of a crowded area where there is not too much blank
area, you have to choose a smaller tile size.
-A larger mesh will give more pixels and so the scatter in the results will be
less (better statistics).
+The situation gets complicated when the datasets are large (compared to
your available RAM when the program is run).
+For example, if a dataset is half the size of your system's available RAM,
and the program's internal analysis needs three or more intermediately
processed copies of it at one moment, there will not be enough RAM to keep
those higher-level intermediate data.
+In such cases, programs that do not do any memory management will crash.
+But fortunately, Gnuastro's programs do have a memory management plan for
such situations.
-@cindex CCD
-@cindex Amplifier
-@cindex Bias current
-@cindex Subaru Telescope
-@cindex Hyper Suprime-Cam
-@cindex Hubble Space Telescope (HST)
-For raw image processing, a single tessellation/grid is not sufficient.
-Raw images are the unprocessed outputs of the camera detectors.
-Modern detectors usually have multiple readout channels each with its own
amplifier.
-For example, the Hubble Space Telescope Advanced Camera for Surveys (ACS) has
four amplifiers over its full detector area dividing the square field of view
to four smaller squares.
-Ground based image detectors are not exempt, for example, each CCD of Subaru
Telescope's Hyper Suprime-Cam camera (which has 104 CCDs) has four amplifiers,
but they have the same height of the CCD and divide the width by four parts.
+@cindex Memory-mapped file
+When the necessary amount of space for an intermediate dataset cannot be
allocated in the RAM, Gnuastro's programs will not use the RAM at all.
+They will use the ``memory-mapped file'' concept in modern operating systems
to create a randomly-named file in your non-volatile memory and use that
instead of the RAM.
+That file will have the exact size (in bytes) of that intermediate dataset.
+Any time the program needs that intermediate dataset, the operating system
will directly go to that file, and bypass your RAM.
+As soon as that file is no longer necessary for the analysis, it will be
deleted.
+But as mentioned above, non-volatile memory has much slower I/O speed than the
RAM.
+Hence in such situations, the programs will become noticeably slower
(sometimes by a factor of 10, depending on your non-volatile memory speed).
-@cindex Channel
-The bias current on each amplifier is different, and initial bias subtraction
is not perfect.
-So even after subtracting the measured bias current, you can usually still
identify the boundaries of different amplifiers by eye.
-See Figure 11(a) in Akhlaghi and Ichikawa (2015) for an example.
-This results in the final reduced data to have non-uniform amplifier-shaped
regions with higher or lower background flux values.
-Such systematic biases will then propagate to all subsequent measurements we
do on the data (for example, photometry and subsequent stellar mass and star
formation rate measurements in the case of galaxies).
+Because of the drop in I/O speed (and thus the speed of your running
program), the moment that any to-be-allocated dataset is memory-mapped,
Gnuastro's programs and libraries will notify you with a descriptive
statement like the one below (this can happen in any phase of their
analysis).
+It shows the location of the memory-mapped file, its size, a small
description of the cause, a pointer to this section of the book for more
information on how to deal with it (if necessary), and what to do to
suppress the message.
-Therefore an accurate analysis requires a two layer tessellation: the top
layer contains larger tiles, each covering one amplifier channel.
-For clarity we will call these larger tiles ``channels''.
-The number of channels along each dimension is defined through the
@option{--numchannels}.
-Each channel is then covered by its own individual smaller tessellation (with
tile sizes determined by the @option{--tilesize} option).
-This will allow independent analysis of two adjacent pixels from different
channels if necessary.
-If the image is processed or the detector only has one amplifier, you can set
the number of channels in both dimension to 1.
+@example
+astarithmetic: ./gnuastro_mmap/Fu7Dhs: temporary memory-mapped file
+(XXXXXXXXXXX bytes) created for intermediate data that is not stored
+in RAM (see the "Memory management" section of Gnuastro's manual for
+optimizing your project's memory management, and thus speed). To
+disable this warning, please use the option '--quietmmap'
+@end example
-The final tessellation can be inspected on the image with the
@option{--checktiles} option that is available to all programs which use
tessellation for localized operations.
-When this option is called, a FITS file with a @file{_tiled.fits} suffix will
be created along with the outputs, see @ref{Automatic output}.
-Each pixel in this image has the number of the tile that covers it.
-If the number of channels in any dimension are larger than unity, you will
notice that the tile IDs are defined such that the first channels is covered
first, then the second and so on.
-For the full list of processing-related common options (including tessellation
options), please see @ref{Processing options}.
+@noindent
+Finally, when the intermediate dataset is no longer necessary, the program
will automatically delete it and notify you with a statement like this:
+@example
+astarithmetic: ./gnuastro_mmap/Fu7Dhs: deleted
+@end example
+@noindent
+To disable these messages, you can run the program with @code{--quietmmap}, or
set the @code{quietmmap} variable in the allocating library function to be
non-zero.
+An important component of these messages is the name of the memory-mapped
file.
+Knowing where the file is, and whether it has been deleted, is important
if the program crashes for any reason: internally (for example, a parameter
is given wrongly) or externally (for example, you mistakenly kill the
running job).
+In the event of a crash, the memory-mapped files will not be deleted and
you have to delete them manually: they are usually large, and successive
crashes can soon fill up your storage if the files are left in place.
+This brings us to managing the memory-mapped files in your non-volatile memory.
+In other words: knowing where they are saved, or intentionally placing them in
different places of your file system, or deleting them when necessary.
+As the examples above show, memory-mapped files are stored in a sub-directory
of the running directory called @file{gnuastro_mmap}.
+If this directory does not exist, Gnuastro will automatically create it when
memory mapping becomes necessary.
+Alternatively, it may happen that the @file{gnuastro_mmap} sub-directory
exists and is not writable, or it cannot be created.
+In such cases, the memory-mapped file for each dataset will be created in the
running directory with a @file{gnuastro_mmap_} prefix.
-@node Automatic output, Output FITS files, Tessellation, Common program
behavior
-@section Automatic output
+Therefore one easy way to delete all memory-mapped files in case of a
crash is to delete everything within the sub-directory (first command
below), or all files starting with this prefix (second command):
-@cindex Standard input
-@cindex Automatic output file names
-@cindex Output file names, automatic
-@cindex Setting output file names automatically
-All the programs in Gnuastro are designed such that specifying an output file
or directory (based on the program context) is optional.
-When no output name is explicitly given (with @option{--output}, see
@ref{Input output options}), the programs will automatically set an output name
based on the input name(s) and what the program does.
-For example, when you are using ConvertType to save FITS image named
@file{dataset.fits} to a JPEG image and do not specify a name for it, the JPEG
output file will be name @file{dataset.jpg}.
-When the input is from the standard input (for example, a pipe, see
@ref{Standard input}), and @option{--output} is not given, the output name will
be the program's name (for example, @file{converttype.jpg}).
+@example
+rm -f gnuastro_mmap/*
+rm -f gnuastro_mmap_*
+@end example
-@vindex --keepinputdir
-Another very important part of the automatic output generation is that all the
directory information of the input file name is stripped off of it.
-This feature can be disabled with the @option{--keepinputdir} option, see
@ref{Input output options}.
-It is the default because astronomical data are usually very large and
organized specially with special file names.
-In some cases, the user might not have write permissions in those
directories@footnote{In fact, even if the data is stored on your own computer,
it is advised to only grant write permissions to the super user or root.
-This way, you will not accidentally delete or modify your valuable data!}.
+A much more common issue when dealing with memory-mapped files is their
location.
+For example, you may be running a program in a partition that is hosted by an
HDD.
+But you also have another partition on an SSD (which has much faster I/O).
+So you want your memory-mapped files to be created in the SSD to speed up your
processing.
+In this scenario, you want your project source directory to only contain your
plain-text scripts and you want your project's built products (even the
temporary memory-mapped files) to be built in a different location because they
are large; thus I/O speed becomes important.
-Let's assume that we are working on a report and want to process the FITS
images from two projects (ABC and DEF), which are stored in the sub-directories
named @file{ABCproject/} and @file{DEFproject/} of our top data directory
(@file{/mnt/data}).
-The following shell commands show how one image from the former is first
converted to a JPEG image through ConvertType and then the objects from an
image in the latter project are detected using NoiseChisel.
-The text after the @command{#} sign are comments (not typed!).
+To host the memory-mapped files in another location (with fast I/O), you
can set @file{gnuastro_mmap} to be a symbolic link to it.
+For example, let's assume you want your memory-mapped files to be stored in
@file{/path/to/dir/for/mmap}.
+All you have to do is to run the following command before your Gnuastro
analysis command(s).
@example
-$ pwd # Current location
-/home/usrname/research/report
-$ ls # List directory contents
-ABC01.jpg
-$ ls /mnt/data/ABCproject # Archive 1
-ABC01.fits ABC02.fits ABC03.fits
-$ ls /mnt/data/DEFproject # Archive 2
-DEF01.fits DEF02.fits DEF03.fits
-$ astconvertt /mnt/data/ABCproject/ABC02.fits --output=jpg # Prog 1
-$ ls
-ABC01.jpg ABC02.jpg
-$ astnoisechisel /mnt/data/DEFproject/DEF01.fits # Prog 2
-$ ls
-ABC01.jpg ABC02.jpg DEF01_detected.fits
+ln -s /path/to/dir/for/mmap gnuastro_mmap
@end example
+The programs will delete a memory-mapped file when it is no longer needed, but
they will not delete the @file{gnuastro_mmap} directory that hosts them.
+So if your project involves many Gnuastro programs (possibly called in
parallel) and you want your memory-mapped files to be in a different location,
you just have to make the symbolic link above once at the start, and all the
programs will use it if necessary.
+Another memory-management scenario that may happen is this: you do not want a
Gnuastro program to allocate internal datasets in the RAM at all.
+For example, the speed of your Gnuastro-related project does not matter at
that moment, and you have higher-priority jobs that are being run at the same
time which need to have RAM available.
+In such cases, you can use the @option{--minmapsize} option that is available
in all Gnuastro programs (see @ref{Processing options}).
+Any intermediate dataset that has a size larger than the value of this option
will be memory-mapped, even if there is space available in your RAM.
+For example, if you want any dataset larger than 100 megabytes to be
memory-mapped, use @option{--minmapsize=100000000} (8 zeros!).
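+On the command-line, this could look like the following (the image name is
hypothetical):
+
+@example
+$ astnoisechisel image.fits --minmapsize=100000000
+@end example
+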
+@cindex Linux kernel
+@cindex Kernel, Linux
+You should not set the value of @option{--minmapsize} to be too small,
otherwise even small intermediate values (that are usually very numerous) in
the program will be memory-mapped.
+However, the kernel can only host a limited number of memory-mapped files
at any moment (by all running programs combined).
+For example, in the default@footnote{If you need to host more memory-mapped
files at one moment, you need to build your own customized Linux kernel.} Linux
kernel on GNU/Linux operating systems this limit is roughly 64000.
+If the total number of memory-mapped files exceeds this number, all the
programs using them will crash.
+Gnuastro's programs will warn you if your given value is too small and may
cause a problem later.
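+On many GNU/Linux systems you may be able to inspect the kernel's mapping
limit through the @code{vm.max_map_count} kernel parameter (this
correspondence is an assumption worth verifying for your own kernel):
+@example
+$ sysctl vm.max_map_count
+@end example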
+Actually, the default behavior for Gnuastro's programs (to only use
memory-mapped files when there is not enough RAM) is a side-effect of
@option{--minmapsize}.
+The pre-defined value of this option (in the lowest-level Gnuastro
configuration file: the installed @file{gnuastro.conf} described in
@ref{Configuration file precedence}) is an extremely large number.
+This value is larger than any possible amount of available RAM.
+You can check it by running any Gnuastro program with the @option{-P} option.
+Because no dataset will be larger than this, by default the programs will
first attempt to use the RAM for temporary storage.
+But if writing in the RAM fails (for any reason, mainly due to lack of
available space), then a memory-mapped file will be created.
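+For example, assuming @option{-P} prints one option per line, you can confirm
the installed default with a pipe to Grep:
+@example
+$ astarithmetic -P | grep minmapsize
+@end example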
-@node Output FITS files, Numeric locale, Automatic output, Common program
behavior
-@section Output FITS files
-@cindex FITS
-@cindex Output FITS headers
-@cindex CFITSIO version on outputs
-The output of many of Gnuastro's programs are (or can be) FITS files.
-The FITS format has many useful features for storing scientific datasets
(cubes, images and tables) along with a robust features for archivability.
-For more on this standard, please see @ref{Fits}.
-As a community convention described in @ref{Fits}, the first extension of all
FITS files produced by Gnuastro's programs only contains the meta-data that is
intended for the file's extension(s).
-For a Gnuastro program, this generic meta-data (that is stored as FITS keyword
records) is its configuration when it produced this dataset: file name(s) of
input(s) and option names, values and comments.
-Note that when the configuration is too trivial (only input filename, for
example, the program @ref{Table}) no meta-data is written in this extension.
-FITS keywords have the following limitations in regards to generic option
names and values which are described below:
-@itemize
-@item
-If a keyword (option name) is longer than 8 characters, the first word in the
record (80 character line) is @code{HIERARCH} which is followed by the keyword
name.
+@node Tables, Tessellation, Memory management, Common program behavior
+@section Tables
-@item
-Values can be at most 75 characters, but for strings, this changes to 73
(because of the two extra @key{'} characters that are necessary).
-However, if the value is a file name, containing slash (@key{/}) characters to
separate directories, Gnuastro will break the value into multiple keywords.
+``A table is a collection of related data held in a structured format within a
database.
+It consists of columns, and rows.'' (from Wikipedia).
+Each column in the table contains the values of one property and each row is a
collection of properties (columns) for one target object.
+For example, let's assume you have just run MakeCatalog (see
@ref{MakeCatalog}) on an image to measure some properties for the labeled
regions (which might be detected galaxies, for example) in the image.
+For each labeled region (detected galaxy), there will be a @emph{row} which
groups its measured properties as @emph{columns}, one column for each property.
+One such property can be the object's magnitude (derived from the sum of its
pixel values), and another can be its center, defined as the light-weighted
average position of those pixels.
+Many such properties can be derived from the raw pixel values and their
position, see @ref{Invoking astmkcatalog} for a long list.
-@item
-Keyword names ignore case, therefore they are all in capital letters.
-Therefore, if you want to use Grep to inspect these keywords, use the
@option{-i} option, like the example below.
+As a summary, for each labeled region (or, galaxy) we have one @emph{row} and
for each measured property we have one @emph{column}.
+This high-level structure is usually the first step for higher-level analysis,
for example, finding the stellar mass or photometric redshift from magnitudes
in multiple colors.
+Thus, tables are not just outputs of programs; in fact, it is much more
common for tables to be the inputs of programs.
+For example, to make a mock galaxy image, you need to feed the properties of
each galaxy into @ref{MakeProfiles} for it to do the inverse of the process
above and make a simulated image from a catalog, see @ref{Sufi simulates a
detection}.
+In other cases, you can feed a table into @ref{Crop} and it will crop out
regions centered on the positions within the table, see @ref{Reddest clumps
cutouts and parallelization}.
+So, to end this relatively long introduction: tables play a very important
role in astronomy, and generally in all branches of data analysis.
-@example
-$ astfits image_detected.fits -h0 | grep -i snquant
-@end example
-@end itemize
+In @ref{Recognized table formats} the currently recognized table formats in
Gnuastro are discussed.
+You can use any of these tables as input or ask for them to be built as output.
+The most common type of table format is a simple plain-text file, with each
row on one line and columns separated by white space characters; this format
is easy to read/write by eye/hand.
+To give it the full functionality of more specific table types like the FITS
tables, Gnuastro has a special convention which you can use to give each column
a name, type, unit, and comments, while still being readable by other plain
text table readers.
+This convention is described in @ref{Gnuastro text table format}.
-The keywords above are classified (separated by an empty line and title) as a
group titled ``ProgramName configuration''.
-This meta-data extension, as well as all the other extensions (which contain
data), also contain have final group of keywords to keep the basic date and
version information of Gnuastro, its dependencies and the pipeline that is
using Gnuastro (if it is under version control).
+When tables are input to a program, the program needs to know which column(s)
it should use for its desired purposes.
+Gnuastro's programs all follow a similar convention for the way you can
select columns in a table.
+They are thoroughly discussed in @ref{Selecting table columns}.
-@table @command
-@item DATE
-The creation time of the FITS file.
-This date is written directly by CFITSIO and is in UT format.
+@menu
+* Recognized table formats:: Table formats that are recognized in Gnuastro.
+* Gnuastro text table format:: Gnuastro's convention for plain text tables.
+* Selecting table columns:: Identify/select certain columns from a table.
+@end menu
-@item COMMIT
-Git's commit description from the running directory of Gnuastro's programs.
-If the running directory is not version controlled or @file{libgit2} is not
installed (see @ref{Optional dependencies}) then this keyword will not be
present.
-The printed value is equivalent to the output of the following command:
+@node Recognized table formats, Gnuastro text table format, Tables, Tables
+@subsection Recognized table formats
-@example
-git describe --dirty --always
-@end example
+The list of table formats that Gnuastro can currently read from and write to
is given below.
+Each has its own advantages and disadvantages, so a short review of each
format is also provided to help you make the best choice based on how you want
to define your input tables or later use your output tables.
-If the running directory contains non-committed work, then the stored value
will have a `@command{-dirty}' suffix.
-This can be very helpful to let you know that the data is not ready to be
shared with collaborators or submitted to a journal.
-You should only share results that are produced after all your work is
committed (safely stored in the version controlled history and thus
reproducible).
+@table @asis
-At first sight, version control appears to be mainly a tool for software
developers.
-However progress in a scientific research is almost identical to progress in
software development: first you have a rough idea that starts with handful of
easy steps.
-But as the first results appear to be promising, you will have to extend, or
generalize, it to make it more robust and work in all the situations your
research covers, not just your first test samples.
-Slowly you will find wrong assumptions or bad implementations that need to be
fixed (`bugs' in software development parlance).
-Finally, when you submit the research to your collaborators or a journal, many
comments and suggestions will come in, and you have to address them.
+@item Plain text table
+This is the most basic and simplest way to create, view, or edit a table by
hand in a text editor.
+The other formats described below are less eye-friendly and have a more formal
structure (for easier computer readability).
+It is fully described in @ref{Gnuastro text table format}.
-Software developers have created version control systems precisely for this
kind of activity.
-Each significant moment in the project's history is called a ``commit'', see
@ref{Version controlled source}.
-A snapshot of the project in each ``commit'' is safely stored away, so you can
revert back to it at a later time, or check changes/progress.
-This way, you can be sure that your work is reproducible and track the
progress and history.
-With version control, experimentation in the project's analysis is greatly
facilitated, since you can easily revert back if a brainstorm test procedure
fails.
+@cindex FITS Tables
+@cindex Tables FITS
+@cindex ASCII table, FITS
+@item FITS ASCII tables
+The FITS ASCII table extension is fully in ASCII encoding and thus easily
readable on any text editor (assuming it is the only extension in the FITS
file).
+If the FITS file also contains binary extensions (for example, an image or
binary table extension), then there will be many hard-to-print characters.
+The FITS ASCII format does not have new-line characters to separate rows.
+In the FITS ASCII table standard, each row is defined as a fixed number of
characters (the value of the @code{NAXIS1} keyword), so to visually inspect it
properly, you would have to adjust your text editor's width to this value.
+All columns start at given character positions and have a fixed width (number
of characters).
-One important feature of version control is that the research result (FITS
image, table, report or paper) can be stamped with the unique commit
information that produced it.
-This information will enable you to exactly reproduce that same result later,
even if you have made changes/progress.
-For one example of a research paper's reproduction pipeline, please see the
@url{https://gitlab.com/makhlaghi/NoiseChisel-paper, reproduction pipeline} of
the @url{https://arxiv.org/abs/1505.01664, paper} describing @ref{NoiseChisel}.
+Numbers in a FITS ASCII table are stored in ASCII format; they are not in the
binary format that the CPU uses.
+Hence, they can take more space in memory, lose their precision, and take
longer to read into memory.
+If you are dealing with integer-type columns (see @ref{Numeric data types}),
another issue with FITS ASCII tables is that the type information for the
column will be lost (there is only one integer type in FITS ASCII tables).
+One problem with the binary format, on the other hand, is that it is not
portable (different CPUs/compilers have different standards for translating
the zeros and ones).
+But since ASCII characters are defined on a byte and are well recognized,
they are better for portability across those various systems.
+Gnuastro's plain-text table format described below is much more portable and
easier for humans to read/write/interpret manually.
-@item CFITSIO
-The version of CFITSIO used (see @ref{CFITSIO}).
+Generally, as the name implies, this format is useful when your table mainly
contains ASCII columns (for example, file names or descriptions).
+They can be useful when you need to include columns with structured ASCII
information along with other extensions in one FITS file.
+In such cases, you can also consider header keywords (see @ref{Fits}).
-@item WCSLIB
-The version of WCSLIB used (see @ref{WCSLIB}).
-Note that older versions of WCSLIB do not report the version internally.
-So this is only available if you are using more recent WCSLIB versions.
+@cindex Binary table, FITS
+@item FITS binary tables
+The FITS binary table is the FITS standard's solution to the issues of
keeping numbers in ASCII format, as described under the FITS ASCII table title
above.
+Only columns defined as a string type (a string of ASCII characters) are
readable in a text editor.
+The portability problem with binary formats discussed above is mostly solved
thanks to the portability of CFITSIO (see @ref{CFITSIO}) and the very long
history of the FITS format which has been widely used since the 1970s.
-@item GSL
-The version of GNU Scientific Library that was used, see @ref{GNU Scientific
Library}.
+In the case of most numbers, storing them in binary format is more memory
efficient than ASCII format.
+For example, to store @code{-25.72034} in ASCII format, you need 9
bytes/characters.
+But if you keep this same number (to the approximate precision possible) as a
4-byte (32-bit) floating point number, you can keep/transmit it with less than
half the amount of memory.
+When catalogs contain thousands/millions of rows in tens/hundreds of columns,
this can lead to significant improvements in memory/bandwidth usage.
+Moreover, since the CPU does its operations in the binary formats, reading the
table in and writing it out is also much faster than an ASCII table.
-@item GNUASTRO
-The version of Gnuastro used (see @ref{Version numbering}).
-@end table
+When you are dealing with integer numbers, the saving can be even better: for
example, if you know all of the values in a column are between @code{0} and
@code{255}, you can use the @code{unsigned char} type, which only takes one
byte! If they are between @code{-128} and @code{127}, then you can use the
(signed) @code{char} type.
+So if you are thoughtful about the limits of your integer columns, you can
greatly reduce the size of your file and also increase the speed at which it
is read/written.
+This can be very useful when sharing your results with collaborators or
publishing them.
+To decrease the file size even more, you can give your output a name ending
in @file{.fits.gz} so it is also compressed after creation.
+Just note that compressing/decompressing is CPU intensive and can slow down
the writing/reading of the file.
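+For example, a minimal sketch with the Table program (the input file name is
hypothetical):
+@example
+$ asttable catalog.fits --output=catalog.fits.gz
+@end example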
-Here is one example of the last few lines of an example output.
+Fortunately, the FITS binary table format also accepts ASCII strings as
column types (along with the various numerical types).
+So your dataset can also contain non-numerical columns.
-@example
- / Versions and date
-DATE = '...' / file creation date
-COMMIT = 'v0-8-g547f6eb' / Commit description in running dir.
-CFITSIO = '3.45 ' / CFITSIO version.
-WCSLIB = '5.19 ' / WCSLIB version.
-GSL = '2.5 ' / GNU Scientific Library version.
-GNUASTRO= '0.7 ' / GNU Astronomy Utilities version.
-END
-@end example
+@end table
-@node Numeric locale, , Output FITS files, Common program behavior
-@section Numeric locale
+@menu
+* Gnuastro text table format:: Reading plain text tables
+@end menu
-@cindex Locale
-@cindex @code{LC_ALL}
-@cindex @code{LC_NUMERIC}
-@cindex Decimal separator
-@cindex Language of command-line
-If your @url{https://en.wikipedia.org/wiki/Locale_(computer_software), system
locale} is not English, it may happen that the `.' is not used as the decimal
separator of basic command-line tools for input or output.
-For example, in Spanish and some other languages the decimal separator (symbol
used to separate the integer and fractional part of a number), is a comma.
-Therefore in such systems, some programs may print @mymath{0.5} as as
`@code{0,5}' (instead of `@code{0.5}').
-This mainly happens in some core operating system tools like @command{awk} or
@command{seq} depend on the locale.
-This can cause problems for other programs (like those in Gnuastro that expect
a `@key{.}' as the decimal separator).
+@node Gnuastro text table format, Selecting table columns, Recognized table
formats, Tables
+@subsection Gnuastro text table format
-To see the effect, please try the commands below.
-The first one will print @mymath{0.5} in your default locale's format.
-The second set will use the Spanish locale for printing numbers (which will
put a comma between the 0 and the 5).
-The third will use the English (US) locale for printing numbers (which will
put a point between the 0 and the 5).
+Plain text files are the most generic, portable, and easiest way to (manually)
create, (visually) inspect, or (manually) edit a table.
+In this format, the ending of a row is defined by the new-line character (a
line on a text editor).
+So when you view it on a text editor, every row will occupy one line.
+The delimiters (or characters separating the columns) are white space
characters (space, horizontal tab, vertical tab) and a comma (@key{,}).
+The only further requirement is that all rows/lines must have the same number
of columns.
+
+The columns do not have to be exactly under each other and the rows can be
arbitrarily long with different lengths.
+For example, the following contents in a file would be interpreted as a table
with 4 columns and 2 rows, with each element interpreted as a 64-bit floating
point type (see @ref{Numeric data types}).
@example
-$ seq 0.5 1
+1 2.234948 128 39.8923e8
+2 , 4.454 792 72.98348e7
+@end example
-$ export LC_NUMERIC=es_ES.utf8
-$ seq 0.5 1
+However, the example above has no other information about the columns (it is
just raw data, with no meta-data).
+To use this table, you have to remember what the numbers in each column
represent.
+Also, when you want to select columns, you have to count their position within
the table.
+This can become frustrating and error-prone (you might get the columns wrong
in your scientific project!), especially as the number of columns increases.
+It is also bad for sending to a colleague, because they will find it hard to
remember/use the columns properly.
-$ export LC_NUMERIC=en_US.utf8
-$ seq 0.5 1
-@end example
+To solve these problems in Gnuastro's programs/libraries you are not limited
to using the column's number, see @ref{Selecting table columns}.
+If the columns have names, units, or comments you can also select your columns
based on searches/matches in these fields, for example, see @ref{Table}.
+Also, without meta-data, you cannot guide the program that reads the table on
how to interpret the numbers.
+As an example, the first and third columns above could be read as integer
types: the first column might be an ID and the third can be the number of
pixels an object occupies in an image.
+So there is no need to read these two columns as a 64-bit floating point type
(which takes more memory, and is slower).
-@noindent
-With the simple command below, you can check your current locale environment
variables for specifying the formats of various things like date, time,
monetary, telephone, numbers, etc.
-You can change any of these, by simply giving different values to the
respective variable like above.
-For a more complete explanation on each variable, see
@url{https://www.baeldung.com/linux/locale-environment-variables}.
+In the bare-minimum example above, you also cannot use strings of characters,
for example, the names of filters, or some other identifier that includes
non-numerical characters.
+In the absence of any information, only numbers can be read robustly.
+Assuming we read columns with non-numerical characters as strings, there
would still be the problem that the strings might contain space (or any
delimiter) characters for some rows.
+So, each `word' in the string would be interpreted as a column and the
program would abort with an error that the rows do not have the same number of
columns.
-@example
-$ locale
-@end example
+To correct for these limitations, Gnuastro defines the following convention
for storing the table meta-data along with the raw data in one plain text file.
+The format is primarily designed for ease of reading/writing by eye/fingers,
but is also structured enough to be read by a program.
-To avoid these kinds of locale-specific problems (for example, another program
not being able to read `@code{0,5}' as half of unity), you can change the
locale by giving the value of @code{C} to the @code{LC_NUMERIC} environment
variable (or the lower-level/generic @code{LC_ALL}).
-You will notice that @code{C} is not a human-language and country identifier
like @code{en_US}, it is the programming locale, which is well recognized by
programmers in all countries and is available on all Unix-like operating
systems (others may not be pre-defined and may need installation).
-You can set the @code{LC_NUMERIC} only for a single command (the first one
below: simply defining the variable in the same line), or all commands within
the running session (the second command below, or ``exporting'' it to all
subsequent commands):
+When the first non-white character in a line is @key{#}, or there are no
non-white characters in it, then the line will not be considered as a row of
data in the table (this is a pretty standard convention in many programs, and
higher level languages).
+In the first case (when the first character of the line is @key{#}), the line
is interpreted as a @emph{comment}.
-@example
-## Change the numeric locale, only for this 'seq' command.
-$ LC_NUMERIC=C seq 0.5 1
+If the comment line starts with `@code{# Column N:}', then it is assumed to
contain information about column @code{N} (a number, counting from 1).
+Comment lines that do not start with this pattern are ignored and you can use
them to include any further information you want to store with the table in the
text file.
+The most generic column information comment line has the following format:
-## Change the locale to the standard, for all commands after it.
-$ export LC_NUMERIC=C
+@example
+# Column N: NAME [UNIT, TYPE(NUM), BLANK] COMMENT
@end example
-If you want to change it generally for all future sessions, you can put the
second command in your shell's startup file.
-For more on startup files, please see @ref{Installation directory}.
-
+@cindex NaN
+@noindent
+Any sequence of characters between `@key{:}' and `@key{[}' will be interpreted
as the column name (so it can contain anything except the `@key{[}' character).
+Anything between the `@key{]}' and the end of the line is defined as a comment.
+Within the brackets, anything before the first `@key{,}' is the units
(physical units, for example, km/s, or erg/s), and anything between the first
and second `@key{,}' is the short type identifier (see below, and @ref{Numeric
data types}).
+If the type identifier is not recognized, the default 64-bit floating point
type will be used.
+The type identifier can optionally be followed by an integer within
parenthesis.
+If the parenthesis is present and the integer is larger than 1, the column is
assumed to be a ``vector column'' (which can have multiple values, for more see
@ref{Vector columns}).
+Finally (still within the brackets), any non-white characters after the second
`@key{,}' are interpreted as the blank value for that column (see @ref{Blank
pixels}).
+The blank value can either be in the same type as the column (for example,
@code{-99} for a signed integer column), or any string (for example, @code{NaN}
in that same column).
+In both cases, the values will be stored in memory as Gnuastro's fixed blank
values for each type.
+For floating point types, Gnuastro's internal blank value is IEEE NaN
(Not-a-Number).
+For signed integers, it is the smallest possible value and for unsigned
integers it is the largest possible value.
+When a formatting problem occurs, or when the column was already given
meta-data in a previous comment, or when the column number is larger than the
actual number of columns in the table (the non-commented, non-empty lines),
then the comment information line will be ignored.
+When a comment information line can be used, the leading and trailing white
space characters will be stripped from all of the elements.
+For example, in this line:
+@example
+# Column 5: column name [km/s, f32,-99] Redshift as speed
+@end example
+The @code{NAME} field will be `@code{column name}' and the @code{TYPE} field
will be `@code{f32}'.
+Note how the white space characters before and after the strings are not
used, but those in the middle remain.
+Also, white space characters are not mandatory.
+Hence, in the example above, the @code{BLANK} field will be given the value of
`@code{-99}'.
+Except for the column number (@code{N}), the rest of the fields are optional.
+Also, the column information comments do not have to be in order.
+In other words, the information for column @mymath{N+m} (@mymath{m>0}) can be
given in a line before column @mymath{N}.
+Furthermore, you do not have to specify information for all columns.
+Those columns that do not have this information will be interpreted with the
default settings (like the case above: values are double precision floating
point, and the column has no name, unit, or comment).
+So these lines are all acceptable for any table (the first one, with nothing
but the column number is redundant):
+@example
+# Column 5:
+# Column 1: ID [,i8] The Clump ID.
+# Column 3: mag_f160w [AB mag, f32] Magnitude from the F160W filter
+@end example
+@noindent
+The data type of the column should be specified with one of the following
values:
+@itemize
+@item
+For a numeric column, you can use any of the numeric types (and their
+recognized identifiers) described in @ref{Numeric data types}.
+@item
+`@code{strN}': for strings.
+The @code{N} value identifies the length of the string (how many characters it
has).
+The start of the string on each row is the first non-delimiter character of
the column that has the string type.
+The next @code{N} characters will be interpreted as a string and all leading
and trailing white space will be removed.
+If the next column's characters are closer than @code{N} characters to the
start of the string column in that line/row, they will be considered part of
the string column.
+If there is a new-line character before the ending of the space given to the
string column (in other words, the string column is the last column), then
reading of the string will stop, even if the @code{N} characters are not
complete yet.
+See @file{tests/table/table.txt} for one example.
+Therefore, the only time you have to pay attention to the positioning and
spaces given to the string column is when it is not the last column in the
table.
-@node Data containers, Data manipulation, Common program behavior, Top
-@chapter Data containers
+The only limitation in this format is that trailing and leading white space
characters will be removed from the columns that are read.
+In most cases, this is the desired behavior, but if trailing and leading
white-spaces are critically important to your analysis, define your own
starting and ending characters and remove them after the table has been read.
+For example, in the sample table below, the two `@key{|}' characters (which
are arbitrary) will remain in the value of the second column and you can remove
them manually later.
+If only one of the leading or trailing white spaces is important for your
work, you can use just one of the `@key{|}'s.
-@cindex File operations
-@cindex Operations on files
-@cindex General file operations
-The most low-level and basic property of a dataset is how it is stored.
-To process, archive and transmit the data, you need a container to store it
first.
-From the start of the computer age, different formats have been defined to
store data, optimized for particular applications.
-One format/container can never be useful for all applications: the storage
defines the application and vice-versa.
-In astronomy, the Flexible Image Transport System (FITS) standard has become
the most common format of data storage and transmission.
-It has many useful features, for example, multiple sub-containers (also known
as extensions or header data units, HDUs) within one file, or support for
tables as well as images.
-Each HDU can store an independent dataset and its corresponding meta-data.
-Therefore, Gnuastro has one program (see @ref{Fits}) specifically designed to
manipulate FITS HDUs and the meta-data (header keywords) in each HDU.
+@example
+# Column 1: ID [label, u8]
+# Column 2: Notes [no unit, str50]
+1 leading and trailing white space is ignored here 2.3442e10
+2 | but they will be preserved here | 8.2964e11
+@end example
-Your astronomical research does not just involve data analysis (where the FITS
format is very useful).
-For example, you want to demonstrate your raw and processed FITS images or
spectra as figures within slides, reports, or papers.
-The FITS format is not defined for such applications.
-Thus, Gnuastro also comes with the ConvertType program (see @ref{ConvertType})
which can be used to convert a FITS image to and from (where possible) other
formats like plain text and JPEG (which allow two way conversion), along with
EPS and PDF (which can only be created from FITS, not the other way round).
+@end itemize
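+To put the pieces of this convention together, here is a minimal sketch of a
complete plain-text table (the column names and values are hypothetical):
+@example
+# Column 1: ID   [, u8]             Object identifier
+# Column 2: NAME [no unit, str7]    Filter name
+# Column 3: MAG  [AB mag, f32,-99]  Measured magnitude
+1  F160W   25.384
+2  F125W   -99
+@end example
+Saving this as @file{cat.txt} and running @command{asttable cat.txt
--information} should report the names, units, and types defined above.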
-Finally, the FITS format is not just for images, it can also store tables.
-Binary tables in particular can be very efficient in storing catalogs that
have more than a few tens of columns and rows.
-However, unlike images (where all elements/pixels have one data type), tables
contain multiple columns and each column can have different properties:
independent data types (see @ref{Numeric data types}) and meta-data.
-In practice, each column can be viewed as a separate container that is grouped
with others in the table.
-The only shared property of the columns in a table is thus the number of
elements they contain.
-To allow easy inspection/manipulation of table columns, Gnuastro has the Table
program (see @ref{Table}).
-It can be used to select certain table columns in a FITS table and see them as
a human readable output on the command-line, or to save them into another plain
text or FITS table.
+Note that the FITS binary table standard does not define the @code{unsigned
int} and @code{unsigned long} types, so if you want to convert your tables to
FITS binary tables, use other types.
+Also, note that in the FITS ASCII table, there is only one integer type
(@code{long}).
+So if you convert a Gnuastro plain text table to a FITS ASCII table with the
@ref{Table} program, the type information for integers will be lost.
+Conversely, if integer types are important for you, you have to manually set
them when reading a FITS ASCII table (for example, with the Table program when
reading/converting into a file, or with the @file{gnuastro/table.h} library
functions when reading into memory).
-@menu
-* Fits:: View and manipulate extensions and keywords.
-* ConvertType:: Convert data to various formats.
-* Table:: Read and Write FITS tables to plain text.
-* Query:: Import data from external databases.
-@end menu
+@node Selecting table columns, , Gnuastro text table format, Tables
+@subsection Selecting table columns
+At the lowest level, the only defining aspect of a column in a table is its
number, or position.
+But selecting columns purely by number is not very convenient; especially
when the tables are large, it can be very frustrating and error-prone.
+Hence, table file formats (for example, see @ref{Recognized table formats})
have ways to store additional information about the columns (meta-data).
+Some of the most common pieces of information about each column are its
@emph{name}, the @emph{units} of data in it, and a @emph{comment} for
longer/informal description of the column's data.
+To facilitate research with Gnuastro, you can select columns by matching, or
searching in these three fields, besides the low-level column number.
+To view the full list of information on the columns in the table, you can use
the Table program (see @ref{Table}) with the command below (replace
@file{table-file} with the file name of your table; if it is a FITS file, you
might also need to specify the HDU/extension that contains the table):
+@example
+$ asttable --information table-file
+@end example
-@node Fits, ConvertType, Data containers, Data containers
-@section Fits
+Gnuastro's programs need the columns for different purposes, for example, in
Crop, you specify the columns containing the central coordinates of the crop
centers with the @option{--coordcol} option (see @ref{Crop options}).
+On the other hand, in MakeProfiles, to specify the column containing the
profile position angles, you must use the @option{--pcol} option (see
@ref{MakeProfiles catalog}).
+Thus, there can be no unified common option name to select columns for all
programs (different columns have different purposes).
+However, when the program expects a column for a specific context, the option
names end in the @option{col} suffix like the examples above.
+These options accept values in integer (column number), or string (metadata
match/search) format.
-@cindex Vatican library
-The ``Flexible Image Transport System'', or FITS, is by far the most common
data container format in astronomy and in constant use since the 1970s.
-Archiving (future usage, simplicity) has been one of the primary design
principles of this format.
-In the last few decades it has proved so useful and robust that the Vatican
Library has also chosen FITS for its ``long-term digital preservation''
project@footnote{@url{https://www.vaticanlibrary.va/home.php?pag=progettodigit}}.
-
-@cindex IAU, international astronomical union
-Although the full name of the standard invokes the idea that it is only for
images, it also contains complete and robust features for tables.
-It started off in the 1970s and was formally published as a standard in 1981,
it was adopted by the International Astronomical Union (IAU) in 1982 and an IAU
working group to maintain its future was defined in 1988.
-The FITS 2.0 and 3.0 standards were approved in 2000 and 2008 respectively,
and the 4.0 draft has also been released recently, please see the
@url{https://fits.gsfc.nasa.gov/fits_standard.html, FITS standard document web
page} for the full text of all versions.
-Also see the @url{https://doi.org/10.1051/0004-6361/201015362, FITS 3.0
standard paper} for a nice introduction and history along with the full
standard.
+If the value can be parsed as a positive integer, it will be seen as the
low-level column number.
+Note that column counting starts from 1, so if you ask for column 0, the
respective program will abort with an error.
+When the value cannot be interpreted as an integer number, it will be seen as
a string of characters which will be used to match/search in the table's
meta-data.
+The meta-data field which the value will be compared with can be selected
through the @option{--searchin} option, see @ref{Input output options}.
+@option{--searchin} can take three values: @code{name}, @code{unit},
@code{comment}.
+The matching will be done following this convention:
-@cindex Meta-data
-Many common image formats, for example, a JPEG, only have one image/dataset
per file, however one great advantage of the FITS standard is that it allows
you to keep multiple datasets (images or tables along with their separate
meta-data) in one file.
-In the FITS standard, each data + metadata is known as an extension, or more
formally a header data unit or HDU.
-The HDUs in a file can be completely independent: you can have multiple images
of different dimensions/sizes or tables as separate extensions in one file.
-However, while the standard does not impose any constraints on the relation
between the datasets, it is strongly encouraged to group data that are
contextually related with each other in one file.
-For example, an image and the table/catalog of objects and their measured
properties in that image.
-Other examples can be images of one patch of sky in different colors
(filters), or one raw telescope image along with its calibration data (tables
or images).
+@itemize
+@item
+If the value is enclosed in two slashes (for example, @command{-x/RA_/}, or
@option{--coordcol=/RA_/}, see @ref{Crop options}), then it is assumed to be a
regular expression with the same convention as GNU AWK.
+GNU AWK has a very well written
@url{https://www.gnu.org/software/gawk/manual/html_node/Regexp.html, chapter}
describing regular expressions, so we will not continue discussing them here.
+Regular expressions are a very powerful tool in matching text and useful in
many contexts.
+We thus strongly encourage you to review that chapter; it can greatly improve
the quality of your work in many cases, not just when searching column
meta-data in Gnuastro.
-As discussed above, the extensions in a FITS file can be completely
independent.
-To keep some information (meta-data) about the group of extensions in the FITS
file, the community has adopted the following convention: put no data in the
first extension, so it is just meta-data.
-This extension can thus be used to store Meta-data regarding the whole file
(grouping of extensions).
-Subsequent extensions may contain data along with their own separate meta-data.
-All of Gnuastro's programs also follow this convention: the main output
dataset(s) are placed in the second (or later) extension(s).
-The first extension contains no data the program's configuration (input file
name, along with all its option values) are stored as its meta-data, see
@ref{Output FITS files}.
+@item
+When the string is not enclosed between `@key{/}'s, any column that exactly
matches the given value in the given field will be selected.
+@end itemize
-The meta-data contain information about the data, for example, which region of
the sky an image corresponds to, the units of the data, what telescope, camera,
and filter the data were taken with, it observation date, or the software that
produced it and its configuration.
-Without the meta-data, the raw dataset is practically just a collection of
numbers and really hard to understand, or connect with the real world (other
datasets).
-It is thus strongly encouraged to supplement your data (at any level of
processing) with as much meta-data about your processing/science as possible.
+Note that in both cases, you can ignore the case of alphabetic characters
with the @option{--ignorecase} option, see @ref{Input output options}.
+Also, in both cases, multiple columns may be selected with one call to such
an option.
+In this case, the order of the selected columns (within one call) will be the
same order as they appear in the table.
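+As a sketch of these conventions (the table and column names below are
hypothetical), the first call selects a column by its exact name, and the
second selects all columns whose names begin with @code{MAG_} through a
regular expression:
+@example
+$ asttable cat.fits --column=ID
+$ asttable cat.fits --column=/^MAG_/
+@end example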
-The meta-data of a FITS file is in ASCII format, which can be easily viewed or
edited with a text editor or on the command-line.
-Each meta-data element (known as a keyword generally) is composed of a name,
value, units and comments (the last two are optional).
-For example, below you can see three FITS meta-data keywords for specifying
the world coordinate system (WCS, or its location in the sky) of a dataset:
-@example
-LATPOLE = -27.805089 / [deg] Native latitude of celestial pole
-RADESYS = 'FK5' / Equatorial coordinate system
-EQUINOX = 2000.0 / [yr] Equinox of equatorial coordinates
-@end example
-However, there are some limitations which discourage viewing/editing the
keywords with text editors.
-For example, there is a fixed length of 80 characters for each keyword (its
name, value, units and comments) and there are no new-line characters, so on a
text editor all the keywords are seen in one line.
-Also, the meta-data keywords are immediately followed by the data which are
commonly in binary format and will show up as strange looking characters on a
text editor, and significantly slowing down the processor.
-Gnuastro's Fits program was designed to allow easy manipulation of FITS
extensions and meta-data keywords on the command-line while conforming fully
with the FITS standard.
-For example, you can copy or cut (copy and remove) HDUs/extensions from one
FITS file to another, or completely delete them.
-It also has features to delete, add, or edit meta-data keywords within one HDU.
-@menu
-* Invoking astfits:: Arguments and options to Header.
-@end menu
+@node Tessellation, Automatic output, Tables, Common program behavior
+@section Tessellation
-@node Invoking astfits, , Fits, Fits
-@subsection Invoking Fits
+It is sometimes necessary to classify the elements in a dataset (for example,
pixels in an image) into a grid of individual, non-overlapping tiles.
+For example, when background sky gradients are present in an image, you can
define a tile grid over the image.
+When the tile sizes are set properly, the background's variation over each
tile will be negligible, allowing you to measure (and subtract) it.
+In other cases (for example, spatial domain convolution in Gnuastro, see
@ref{Convolve}), it might simply be for speed of processing: each tile can be
processed independently on a separate CPU thread.
+In the arts and mathematics, this process is formally known as
@url{https://en.wikipedia.org/wiki/Tessellation, tessellation}.
-Fits can print or manipulate the FITS file HDUs (extensions), meta-data
keywords in a given HDU.
-The executable name is @file{astfits} with the following general template
+The size of the regular tiles (in units of data-elements, or pixels in an
image) can be defined with the @option{--tilesize} option.
+It takes multiple numbers (separated by a comma) which will be the length
along the respective dimension (in FORTRAN/FITS dimension order).
+Divisions are also acceptable, but must result in an integer.
+For example, @option{--tilesize=30,40} can be used for an image (a 2D dataset).
+The regular tile size along the first FITS axis (horizontal when viewed in SAO
DS9) will be 30 pixels and along the second it will be 40 pixels.
+Ideally, @option{--tilesize} should be selected such that all tiles in the
image have exactly the same size.
+In other words, the dataset length in each dimension should be divisible by
the tile size in that dimension.
-@example
-$ astfits [OPTION...] ASTRdata
-@end example
+However, this is not always possible: the dataset can be any size and every
pixel in it is valuable.
+In such cases, Gnuastro will look at the significance of the remainder
length.
+If it is not significant (for example, one or two pixels), then it will just
increase the size of the first tile in the respective dimension and allow the
rest of the tiles to have the required size.
+When the remainder is significant (for example, one pixel less than the size
along that dimension), the remainder will be added to one regular tile's size
and the large tile will be cut in half and put in the two ends of the
grid/tessellation.
+In this way, all the tiles in the central regions of the dataset will have the
regular tile sizes and the tiles on the edge will be slightly larger/smaller
depending on the remainder significance.
+The fraction which defines the remainder significance along all dimensions can
be set through @option{--remainderfrac}.
+The best tile size is directly related to the spatial properties of the
quantity you want to study (for example, the gradient over the image).
+In practice, we assume that the gradient is negligible over each tile.
+So if there is a strong gradient (for example, in long-wavelength
ground-based images) or the image is of a crowded area without much blank
space, you have to choose a smaller tile size.
+A larger tile will contain more pixels, so the scatter in the results will be
smaller (better statistics).
-@noindent
-One line examples:
+@cindex CCD
+@cindex Amplifier
+@cindex Bias current
+@cindex Subaru Telescope
+@cindex Hyper Suprime-Cam
+@cindex Hubble Space Telescope (HST)
+For raw image processing, a single tessellation/grid is not sufficient.
+Raw images are the unprocessed outputs of the camera detectors.
+Modern detectors usually have multiple readout channels each with its own
amplifier.
+For example, the Hubble Space Telescope Advanced Camera for Surveys (ACS) has
four amplifiers over its full detector area dividing the square field of view
to four smaller squares.
+Ground-based image detectors are not exempt; for example, each CCD of Subaru
Telescope's Hyper Suprime-Cam camera (which has 104 CCDs) has four amplifiers,
each spanning the full height of the CCD and dividing its width into four
parts.
-@example
-## View general information about every extension:
-$ astfits image.fits
+@cindex Channel
+The bias current on each amplifier is different, and initial bias subtraction
is not perfect.
+So even after subtracting the measured bias current, you can usually still
identify the boundaries of different amplifiers by eye.
+See Figure 11(a) in Akhlaghi and Ichikawa (2015) for an example.
+This results in the final reduced data having non-uniform, amplifier-shaped
regions with higher or lower background flux values.
+Such systematic biases will then propagate to all subsequent measurements we
do on the data (for example, photometry and subsequent stellar mass and star
formation rate measurements in the case of galaxies).
-## Print the header keywords in the second HDU (counting from 0):
-$ astfits image.fits -h1
+Therefore an accurate analysis requires a two-layer tessellation: the top
layer contains larger tiles, each covering one amplifier channel.
+For clarity we will call these larger tiles ``channels''.
+The number of channels along each dimension is defined through the
@option{--numchannels} option.
+Each channel is then covered by its own individual smaller tessellation (with
tile sizes determined by the @option{--tilesize} option).
+This will allow independent analysis of two adjacent pixels from different
channels if necessary.
+If the image is processed or the detector only has one amplifier, you can set
the number of channels in both dimensions to 1.
-## Only print header keywords that contain `NAXIS':
-$ astfits image.fits -h1 | grep NAXIS
+The final tessellation can be inspected on the image with the
@option{--checktiles} option that is available to all programs which use
tessellation for localized operations.
+When this option is called, a FITS file with a @file{_tiled.fits} suffix will
be created along with the outputs, see @ref{Automatic output}.
+Each pixel in this image has the number of the tile that covers it.
+If the number of channels in any dimension is larger than unity, you will
notice that the tile IDs are defined such that the first channel is covered
first, then the second, and so on.
+For the full list of processing-related common options (including tessellation
options), please see @ref{Processing options}.
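+For example, a rough sketch of such an inspection for a detector read by two
amplifiers along the horizontal axis (the file name and sizes below are only
illustrative):
+@example
+$ astnoisechisel image.fits --tilesize=30,30 --numchannels=2,1 \
+                 --checktiles
+@end example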
-## Only print the WCS standard PC matrix elements
-$ astfits image.fits -h1 | grep 'PC._.'
-## Copy a HDU from input.fits to out.fits:
-$ astfits input.fits --copy=hdu-name --output=out.fits
-## Update the OLDKEY keyword value to 153.034:
-$ astfits --update=OLDKEY,153.034,"Old keyword comment"
-## Delete one COMMENT keyword and add a new one:
-$ astfits --delete=COMMENT --comment="Anything you like ;-)."
-## Write two new keywords with different values and comments:
-$ astfits --write=MYKEY1,20.00,"An example keyword" --write=MYKEY2,fd
+@node Automatic output, Output FITS files, Tessellation, Common program
behavior
+@section Automatic output
-## Inspect individual pixel area taken based on its WCS (in degree^2).
-## Then convert the area to arcsec^2 with the Arithmetic program.
-$ astfits input.fits --pixelareaonwcs -o pixarea.fits
-$ astarithmetic pixarea.fits 3600 3600 x x -o pixarea_arcsec2.fits
-@end example
+@cindex Standard input
+@cindex Automatic output file names
+@cindex Output file names, automatic
+@cindex Setting output file names automatically
+All the programs in Gnuastro are designed such that specifying an output file
or directory (based on the program context) is optional.
+When no output name is explicitly given (with @option{--output}, see
@ref{Input output options}), the programs will automatically set an output name
based on the input name(s) and what the program does.
+For example, when you are using ConvertType to save a FITS image named
@file{dataset.fits} to a JPEG image and do not specify a name for it, the JPEG
output file will be named @file{dataset.jpg}.
+When the input is from the standard input (for example, a pipe, see
@ref{Standard input}), and @option{--output} is not given, the output name will
be the program's name (for example, @file{converttype.jpg}).
-@cindex HDU
-@cindex HEALPix
-When no action is requested (and only a file name is given), Fits will print a
list of information about the extension(s) in the file.
-This information includes the HDU number, HDU name (@code{EXTNAME} keyword),
type of data (see @ref{Numeric data types}, and the number of data elements it
contains (size along each dimension for images and table rows and columns).
-Optionally, a comment column is printed for special situations (like a 2D
HEALPix grid that is usually stored as a 1D dataset/table).
-You can use this to get a general idea of the contents of the FITS file and
what HDU to use for further processing, either with the Fits program or any
other Gnuastro program.
+@vindex --keepinputdir
+Another very important part of the automatic output generation is that all
the directory information of the input file name is stripped from it.
+This feature can be disabled with the @option{--keepinputdir} option, see
@ref{Input output options}.
+It is the default because astronomical data are usually very large and
organized in dedicated directories with particular file names.
+In some cases, the user might not have write permissions in those
directories@footnote{In fact, even if the data is stored on your own computer,
it is advised to only grant write permissions to the super user or root.
+This way, you will not accidentally delete or modify your valuable data!}.
-Here is one example of information about a FITS file with four extensions: the
first extension has no data, it is a purely meta-data HDU (commonly used to
keep meta-data about the whole file, or grouping of extensions, see @ref{Fits}).
-The second extension is an image with name @code{IMAGE} and single precision
floating point type (@code{float32}, see @ref{Numeric data types}), it has 4287
pixels along its first (horizontal) axis and 4286 pixels along its second
(vertical) axis.
-The third extension is also an image with name @code{MASK}.
-It is in 2-byte integer format (@code{int16}) which is commonly used to keep
information about pixels (for example, to identify which ones were saturated,
or which ones had cosmic rays and so on), note how it has the same size as the
@code{IMAGE} extension.
-The third extension is a binary table called @code{CATALOG} which has 12371
rows and 5 columns (it probably contains information about the sources in the
image).
+Let's assume that we are working on a report and want to process the FITS
images from two projects (ABC and DEF), which are stored in the sub-directories
named @file{ABCproject/} and @file{DEFproject/} of our top data directory
(@file{/mnt/data}).
+The following shell commands show how one image from the former is first
converted to a JPEG image through ConvertType and then the objects from an
image in the latter project are detected using NoiseChisel.
+The text after the @command{#} sign is a comment (not typed!).
@example
-GNU Astronomy Utilities X.X
-Run on Day Month DD HH:MM:SS YYYY
------
-HDU (extension) information: `image.fits'.
- Column 1: Index (counting from 0).
- Column 2: Name (`EXTNAME' in FITS standard).
- Column 3: Image data type or `table' format (ASCII or binary).
- Column 4: Size of data in HDU.
------
-0 n/a uint8 0
-1 IMAGE float32 4287x4286
-2 MASK int16 4287x4286
-3 CATALOG table_binary 12371x5
+$ pwd # Current location
+/home/usrname/research/report
+$ ls # List directory contents
+ABC01.jpg
+$ ls /mnt/data/ABCproject # Archive 1
+ABC01.fits ABC02.fits ABC03.fits
+$ ls /mnt/data/DEFproject # Archive 2
+DEF01.fits DEF02.fits DEF03.fits
+$ astconvertt /mnt/data/ABCproject/ABC02.fits --output=jpg # Prog 1
+$ ls
+ABC01.jpg ABC02.jpg
+$ astnoisechisel /mnt/data/DEFproject/DEF01.fits # Prog 2
+$ ls
+ABC01.jpg ABC02.jpg DEF01_detected.fits
@end example
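+Had you instead called the last command with @option{--keepinputdir} (and
assuming you have write permission in that archive), the output would keep the
input's directory:
+@example
+$ astnoisechisel /mnt/data/DEFproject/DEF01.fits --keepinputdir
+$ ls /mnt/data/DEFproject
+DEF01.fits DEF01_detected.fits DEF02.fits DEF03.fits
+@end example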
-If a specific HDU is identified on the command-line with the @option{--hdu}
(or @option{-h} option) and no operation requested, then the full list of
header keywords in that HDU will be printed (as if the @option{--printallkeys}
was called, see below).
-It is important to remember that this only occurs when @option{--hdu} is given
on the command-line.
-The @option{--hdu} value given in a configuration file will only be used when
a specific operation on keywords requested.
-Therefore as described in the paragraphs above, when no explicit call to the
@option{--hdu} option is made on the command-line and no operation is requested
(on the command-line or configuration files), the basic information of each
HDU/extension is printed.
-
-The operating mode and input/output options to Fits are similar to the other
programs and fully described in @ref{Common options}.
-The options particular to Fits can be divided into three groups:
-1) those related to modifying HDUs or extensions (see @ref{HDU information and
manipulation}), and
-2) those related to viewing/modifying meta-data keywords (see @ref{Keyword
inspection and manipulation}).
-3) those related to creating meta-images where each pixel shows values for a
specific property of the image (see @ref{Pixel information images}).
-These three classes of options cannot be called together in one run: you can
either work on the extensions, meta-data keywords in any instance of Fits, or
create meta-images where each pixel shows a particular information about the
image itself.
-@menu
-* HDU information and manipulation:: Learn about the HDUs and move them.
-* Keyword inspection and manipulation:: Manipulate metadata keywords in a HDU.
-* Pixel information images:: Pixel values contain information on the pixels.
-@end menu
+@node Output FITS files, Numeric locale, Automatic output, Common program
behavior
+@section Output FITS files
+@cindex FITS
+@cindex Output FITS headers
+@cindex CFITSIO version on outputs
+The outputs of many of Gnuastro's programs are (or can be) FITS files.
+The FITS format has many useful features for storing scientific datasets
(cubes, images and tables) along with robust features for archivability.
+For more on this standard, please see @ref{Fits}.
-@node HDU information and manipulation, Keyword inspection and manipulation,
Invoking astfits, Invoking astfits
-@subsubsection HDU information and manipulation
-Each FITS file header data unit, or HDU (also known as an extension) is an
independent dataset (data + meta-data).
-Multiple HDUs can be stored in one FITS file, see @ref{Fits}.
-The general HDU-related options to the Fits program are listed below as two
general classes:
-the first group below focus on HDU information while the latter focus on
manipulating (moving or deleting) the HDUs.
+As a community convention described in @ref{Fits}, the first extension of all
FITS files produced by Gnuastro's programs only contains the meta-data that is
intended for the file's extension(s).
+For a Gnuastro program, this generic meta-data (that is stored as FITS keyword
records) is its configuration when it produced this dataset: file name(s) of
input(s) and option names, values and comments.
+Note that when the configuration is too trivial (for example, only an input
file name, as in the program @ref{Table}), no meta-data is written in this
extension.
-The options below print information about the given HDU on the command-line.
-Thus they cannot be called together in one command (each has its own
independent output).
+FITS keywords have the following limitations in regard to generic option
names and values, which are described below:
-@table @option
-@item -n
-@itemx --numhdus
-Print the number of extensions/HDUs in the given file.
-Note that this option must be called alone and will only print a single number.
-It is thus useful in scripts, for example, when you need to do check the
number of extensions in a FITS file.
+@itemize
+@item
+If a keyword (option name) is longer than 8 characters, the first word in the
record (80 character line) is @code{HIERARCH}, which is followed by the
keyword name (see the sketch of such a record after this list).
-For a complete list of basic meta-data on the extensions in a FITS file, do
not use any of the options in this section or in @ref{Keyword inspection and
manipulation}.
-For more, see @ref{Invoking astfits}.
+@item
+Values can be at most 75 characters, but for strings, this changes to 73 (because of the two extra @key{'} characters that are necessary).
+However, if the value is a file name containing slash (@key{/}) characters to separate directories, Gnuastro will break the value into multiple keywords.
-@item --hastablehdu
-Print @code{1} (on standard output) if at least one table HDU (ASCII or
binary) exists in the FITS file.
-Otherwise (when no table HDU exists in the file), print @code{0}.
+@item
+Keyword names are case-insensitive, so they are stored in all capital letters.
+Therefore, if you want to use Grep to inspect these keywords, use its @option{-i} option, like the example below.
-@item --listtablehdus
-Print the names or numbers (when a name does not exist, counting from zero) of
HDUs that contain a table (ASCII or Binary) on standard output, one per line.
-Otherwise (when no table HDU exists in the file) nothing will be printed.
+@example
+$ astfits image_detected.fits -h0 | grep -i snquant
+@end example
+@end itemize
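+
+As a sketch of the @code{HIERARCH} convention in the first point above, a record for a long option name like @option{interpnumngb} (12 characters) might appear in the header as:
+
+@example
+HIERARCH INTERPNUMNGB = 15 / Number of neighbors for interpolation
+@end example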
-@item --hasimagehdu
-Print @code{1} (on standard output) if at least one image HDU exists in the
FITS file.
-Otherwise (when no image HDU exists in the file), print @code{0}.
+The keywords above are classified (separated by an empty line and title) as a group titled ``ProgramName configuration''.
+This meta-data extension, as well as all the other extensions (which contain data), also contains a final group of keywords to keep the basic date and version information of Gnuastro, its dependencies and the pipeline that is using Gnuastro (if it is under version control).
-In the FITS standard, any array with any dimensions is called an ``image'',
therefore this option includes 1, 3 and 4 dimensional arrays too.
-However, an image HDU with zero dimensions (which is usually the first
extension and only contains metadata) is not counted here.
+@table @command
-@item --listimagehdus
-Print the names or numbers (when a name does not exist, counting from zero) of
HDUs that contain an image on standard output, one per line.
-Otherwise (when no image HDU exists in the file) nothing will be printed.
+@item DATE
+The creation time of the FITS file.
+This date is written directly by CFITSIO and is in UT format.
-In the FITS standard, any array with any dimensions is called an ``image'',
therefore this option includes 1, 3 and 4 dimensional arrays too.
-However, an image HDU with zero dimensions (which is usually the first
extension and only contains metadata) is not counted here.
+@item COMMIT
+Git's commit description from the running directory of Gnuastro's programs.
+If the running directory is not version controlled or @file{libgit2} is not
installed (see @ref{Optional dependencies}) then this keyword will not be
present.
+The printed value is equivalent to the output of the following command:
-@item --listallhdus
-Print the names or numbers (when a name does not exist, counting from zero) of
all HDUs within the input file on the standard output, one per line.
+@example
+git describe --dirty --always
+@end example
-@item --pixelscale
-Print the HDU's pixel-scale (change in world coordinate for one pixel along
each dimension) and pixel area or voxel volume.
-Without the @option{--quiet} option, the output of @option{--pixelscale} has
multiple lines and explanations, thus being more human-friendly.
-It prints the file/HDU name, number of dimensions, and the units along with
the actual pixel scales.
-Also, when any of the units are in degrees, the pixel scales and area/volume
are also printed in units of arc-seconds.
-For 3D datasets, the pixel area (on each 2D slice of the 3D cube) is printed
as well as the voxel volume.
-If you only want the pixel area of a 2D image in units of arcsec@mymath{^2}
you can use @option{--pixelareaarcsec2} described below.
+If the running directory contains non-committed work, then the stored value
will have a `@command{-dirty}' suffix.
+This can be very helpful to let you know that the data is not ready to be
shared with collaborators or submitted to a journal.
+You should only share results that are produced after all your work is
committed (safely stored in the version controlled history and thus
reproducible).
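+
+As a sketch of how you might check this in an output (here assumed to be called @file{result.fits}) before sharing it, the command below uses the @option{--keyvalue} option (see @ref{Keyword inspection and manipulation}) to print the stored commit description; a `@command{-dirty}' suffix in the printed value warns that the result was produced with uncommitted changes:
+
+@example
+$ astfits result.fits -h0 --keyvalue=COMMIT --quiet
+@end example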
-However, in scripts (that are to be run automatically), this human-friendly
format is annoying, so when called with the @option{--quiet} option, only the
pixel-scale value(s) along each dimension is(are) printed in one line.
-These numbers are followed by the pixel area (in the raw WCS units).
-For 3D datasets, this will be area on each 2D slice.
-Finally, for 3D datasets, a final number (the voxel volume) is printed.
-As a summary, in @option{--quiet} mode, for 2D datasets three numbers are
printed and for 3D datasets, 5 numbers are printed.
-If the dataset has more than 3 dimensions, only the pixel-scale values are
printed (no area or volume will be printed).
+At first sight, version control appears to be mainly a tool for software
developers.
+However, progress in scientific research is almost identical to progress in software development: first you have a rough idea that starts with a handful of easy steps.
+But as the first results appear promising, you will have to extend or generalize it to make it more robust, so it works in all the situations your research covers, not just your first test samples.
+Slowly you will find wrong assumptions or bad implementations that need to be
fixed (`bugs' in software development parlance).
+Finally, when you submit the research to your collaborators or a journal, many
comments and suggestions will come in, and you have to address them.
-@item --pixelareaarcsec2
-Print the HDU's pixel area in units of arcsec@mymath{^2}.
-This option only works on 2D images, that have WCS coordinates in units of
degrees.
-For lower-level information about the pixel scale in each dimension, see
@option{--pixelscale} (described above).
+Software developers have created version control systems precisely for this
kind of activity.
+Each significant moment in the project's history is called a ``commit'', see
@ref{Version controlled source}.
+A snapshot of the project in each ``commit'' is safely stored away, so you can revert to it at a later time, or check changes/progress.
+This way, you can be sure that your work is reproducible, and you can track its progress and history.
+With version control, experimentation in the project's analysis is greatly facilitated, since you can easily revert to a previous state if an experimental procedure fails.
-@item --skycoverage
-@cindex Image's sky coverage
-@cindex Coverage of image over sky
-Print the rectangular area (or 3D cube) covered by the given image/datacube
HDU over the Sky in the WCS units.
-The covered area is reported in two ways:
-1) the center and full width in each dimension,
-2) the minimum and maximum sky coordinates in each dimension.
-This is option is thus useful when you want to get a general feeling of a new
image/dataset, or prepare the inputs to query external databases in the region
of the image (for example, with @ref{Query}).
+One important feature of version control is that the research result (FITS
image, table, report or paper) can be stamped with the unique commit
information that produced it.
+This information will enable you to exactly reproduce that same result later,
even if you have made changes/progress.
+For one example of a research paper's reproduction pipeline, please see the
@url{https://gitlab.com/makhlaghi/NoiseChisel-paper, reproduction pipeline} of
the @url{https://arxiv.org/abs/1505.01664, paper} describing @ref{NoiseChisel}.
-If run without the @option{--quiet} option, the values are given with a
human-friendly description.
-For example, here is the output of this option on an image taken near the star
Castor:
+@item CFITSIO
+The version of CFITSIO used (see @ref{CFITSIO}).
-@example
-$ astfits castor.fits --skycoverage
-Input file: castor.fits (hdu: 1)
+@item WCSLIB
+The version of WCSLIB used (see @ref{WCSLIB}).
+Note that older versions of WCSLIB do not report the version internally.
+So this is only available if you are using more recent WCSLIB versions.
-Sky coverage by center and (full) width:
- Center: 113.9149075 31.93759664
- Width: 2.41762045 2.67945253
+@item GSL
+The version of the GNU Scientific Library that was used, see @ref{GNU Scientific Library}.
-Sky coverage by range along dimensions:
- RA 112.7235592 115.1411797
- DEC 30.59262123 33.27207376
-@end example
+@item GNUASTRO
+The version of Gnuastro used (see @ref{Version numbering}).
+@end table
-With the @option{--quiet} option, the values are more machine-friendly (easy
to parse).
-It has two lines, where the first line contains the center/width values and
the second line shows the coordinate ranges in each dimension.
+Here are the last few lines of one example output:
@example
-$ astfits castor.fits --skycoverage --quiet
-113.9149075 31.93759664 2.41762045 2.67945253
-112.7235592 115.1411797 30.59262123 33.27207376
+ / Versions and date
+DATE = '...' / file creation date
+COMMIT = 'v0-8-g547f6eb' / Commit description in running dir.
+CFITSIO = '3.45 ' / CFITSIO version.
+WCSLIB = '5.19 ' / WCSLIB version.
+GSL = '2.5 ' / GNU Scientific Library version.
+GNUASTRO= '0.7 ' / GNU Astronomy Utilities version.
+END
@end example
-Note that this is a simple rectangle (cube in 3D) definition, so if the image
is rotated in relation to the celestial coordinates a general polygon is
necessary to exactly describe the coverage.
-Hence when there is rotation, the reported area will be larger than the actual
area containing data, you can visually see the area with the
@option{--pixelareaonwcs} option of @ref{Fits}.
-
-Currently this option only supports images that are less than 180 degrees in
width (which is usually the case!).
-This requirement has been necessary to account for images that cross the RA=0
hour circle on the sky.
-Please get in touch with us at @url{mailto:bug-gnuastro@@gnu.org} if you have
an image that is larger than 180 degrees so we try to find a solution based on
need.
-
-@item --datasum
-@cindex @code{DATASUM}: FITS keyword
-Calculate and print the given HDU's "datasum" to stdout.
-The given HDU is specified with the @option{--hdu} (or @option{-h}) option.
-This number is calculated by parsing all the bytes of the given HDU's data
records (excluding keywords).
-This option ignores any possibly existing @code{DATASUM} keyword in the HDU.
-For more on @code{DATASUM} in the FITS standard, see @ref{Keyword inspection
and manipulation} (under the @code{checksum} component of @option{--write}).
-
-You can use this option to confirm that the data in two different HDUs
(possibly with different keywords) is identical.
-Its advantage over @option{--write=datasum} (which writes the @code{DATASUM}
keyword into the given HDU) is that it does not require write permissions.
-@end table
+@node Numeric locale, , Output FITS files, Common program behavior
+@section Numeric locale
-The following options manipulate (move/delete) the HDUs in one FITS file or to
another FITS file.
-These options may be called multiple times in one run.
-If so, the extensions will be copied from the input FITS file to the output
FITS file in the given order (on the command-line and also in configuration
files, see @ref{Configuration file precedence}).
-If the separate classes are called together in one run of Fits, then first
@option{--copy} is run (on all specified HDUs), followed by @option{--cut}
(again on all specified HDUs), and then @option{--remove} (on all specified
HDUs).
+@cindex Locale
+@cindex @code{LC_ALL}
+@cindex @code{LC_NUMERIC}
+@cindex Decimal separator
+@cindex Language of command-line
+If your @url{https://en.wikipedia.org/wiki/Locale_(computer_software), system
locale} is not English, it may happen that the `.' is not used as the decimal
separator of basic command-line tools for input or output.
+For example, in Spanish and some other languages the decimal separator (the symbol used to separate the integer and fractional parts of a number) is a comma.
+Therefore in such systems, some programs may print @mymath{0.5} as `@code{0,5}' (instead of `@code{0.5}').
+This mainly happens in core operating system tools like @command{awk} or @command{seq} that depend on the locale.
+This can cause problems for other programs (like those in Gnuastro that expect a `@key{.}' as the decimal separator).
-The @option{--copy} and @option{--cut} options need an output FITS file
(specified with the @option{--output} option).
-If the output file exists, then the specified HDU will be copied following the
last extension of the output file (the existing HDUs in it will be untouched).
-Thus, after Fits finishes, the copied HDU will be the last HDU of the output
file.
-If no output file name is given, then automatic output will be used to store
the HDUs given to this option (see @ref{Automatic output}).
+To see the effect, please try the commands below.
+The first one will print @mymath{0.5} in your default locale's format.
+The second set will use the Spanish locale for printing numbers (which will
put a comma between the 0 and the 5).
+The third will use the English (US) locale for printing numbers (which will
put a point between the 0 and the 5).
-@table @option
+@example
+$ seq 0.5 1
-@item -C STR
-@itemx --copy=STR
-Copy the specified extension into the output file, see explanations above.
+$ export LC_NUMERIC=es_ES.utf8
+$ seq 0.5 1
-@item -k STR
-@itemx --cut=STR
-Cut (copy to output, remove from input) the specified extension into the
-output file, see explanations above.
+$ export LC_NUMERIC=en_US.utf8
+$ seq 0.5 1
+@end example
-@item -R STR
-@itemx --remove=STR
-Remove the specified HDU from the input file.
+@noindent
+With the simple command below, you can check your current locale environment variables, which specify the formats of various things like dates, times, monetary values, telephone numbers and numbers in general.
+You can change any of these by simply giving a different value to the respective variable, as shown above.
+For a more complete explanation of each variable, see @url{https://www.baeldung.com/linux/locale-environment-variables}.
-The first (zero-th) HDU cannot be removed with this option.
-Consider using @option{--copy} or @option{--cut} in combination with
@option{primaryimghdu} to not have an empty zero-th HDU.
-From CFITSIO: ``In the case of deleting the primary array (the first HDU in
the file) then [it] will be replaced by a null primary array containing the
minimum set of required keywords and no data.''.
-So in practice, any existing data (array) and meta-data in the first extension
will be removed, but the number of extensions in the file will not change.
-This is because of the unique position the first FITS extension has in the
FITS standard (for example, it cannot be used to store tables).
+@example
+$ locale
+@end example
-@item --primaryimghdu
-Copy or cut an image HDU to the zero-th HDU/extension a file that does not yet
exist.
-This option is thus irrelevant if the output file already exists or the
copied/cut extension is a FITS table.
-For example, with the commands below, first we make sure that @file{out.fits}
does not exist, then we copy the first extension of @file{in.fits} to the
zero-th extension of @file{out.fits}.
+To avoid these kinds of locale-specific problems (for example, another program
not being able to read `@code{0,5}' as half of unity), you can change the
locale by giving the value of @code{C} to the @code{LC_NUMERIC} environment
variable (or the lower-level/generic @code{LC_ALL}).
+You will notice that @code{C} is not a human-language and country identifier like @code{en_US}; it is the programming locale, which is well recognized by programmers in all countries and is available on all Unix-like operating systems (others may not be pre-defined and may need installation).
+You can set @code{LC_NUMERIC} only for a single command (the first one below: simply defining the variable in the same line), or for all commands within the running session (the second command below, ``exporting'' it to all subsequent commands):
@example
-$ rm -f out.fits
-$ astfits in.fits --copy=1 --primaryimghdu --output=out.fits
+## Change the numeric locale, only for this 'seq' command.
+$ LC_NUMERIC=C seq 0.5 1
+
+## Change the locale to the standard, for all commands after it.
+$ export LC_NUMERIC=C
@end example
-If we had not used @option{--primaryimghdu}, then the zero-th extension of
@file{out.fits} would have no data, and its second extension would host the
copied image (just like any other output of Gnuastro).
+If you want to change it generally for all future sessions, you can put the
second command in your shell's startup file.
+For more on startup files, please see @ref{Installation directory}.
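+
+For example, assuming your shell is Bash (where the startup file is usually @file{~/.bashrc}), a minimal sketch of making this permanent would be:
+
+@example
+$ echo 'export LC_NUMERIC=C' >> ~/.bashrc
+@end example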
-@end table
-@node Keyword inspection and manipulation, Pixel information images, HDU
information and manipulation, Invoking astfits
-@subsubsection Keyword inspection and manipulation
-The meta-data in each header data unit, or HDU (also known as extension, see
@ref{Fits}) is stored as ``keyword''s.
-Each keyword consists of a name, value, unit, and comments.
-The Fits program (see @ref{Fits}) options related to viewing and manipulating
keywords in a FITS HDU are described below.
-First, let's review the @option{--keyvalue} option which should be called
separately from the rest of the options described in this section.
-Also, unlike the rest of the options in this section, with
@option{--keyvalue}, you can give more than one input file.
-@table @option
-@item -l STR[,STR[,...]
-@itemx --keyvalue=STR[,STR[,...]
-Only print the value of the requested keyword(s): the @code{STR}s.
-@option{--keyvalue} can be called multiple times, and each call can contain
multiple comma-separated keywords.
-If more than one file is given, this option uses the same HDU/extension for
all of them (value to @option{--hdu}).
-For example, you can get the number of dimensions of the three FITS files in
the running directory, as well as the length along each dimension, with this
command:
-@example
-$ astfits *.fits --keyvalue=NAXIS,NAXIS1 --keyvalue=NAXIS2
-image-a.fits 2 774 672
-image-b.fits 2 774 672
-image-c.fits 2 387 336
-@end example
-If only one input is given, and the @option{--quiet} option is activated, the
file name is not printed on the first column, only the values of the requested
keywords.
-@example
-$ astfits image-a.fits --keyvalue=NAXIS,NAXIS1 \
- --keyvalue=NAXIS2 --quiet
-2 774 672
-@end example
-The output is internally stored (and finally printed) as a table (with one
column per keyword).
-Therefore just like the Table program, you can use @option{--colinfoinstdout}
to print the metadata like the example below (also see @ref{Invoking asttable}).
-The keyword metadata (comments and units) are extracted from the comments and
units of the keyword in the input files (first file that has a comment or unit).
-Hence if the keyword does not have units or comments in any of the input
files, they will be empty.
-For more on Gnuastro's plain-text metadata format, see @ref{Gnuastro text
table format}.
-@example
-$ astfits *.fits --keyvalue=NAXIS,NAXIS1,NAXIS2 \
- --colinfoinstdout
-# Column 1: FILENAME [name,str10,] Name of input file.
-# Column 2: NAXIS [ ,u8 ,] number of data axes
-# Column 3: NAXIS1 [ ,u16 ,] length of data axis 1
-# Column 4: NAXIS2 [ ,u16 ,] length of data axis 2
-image-a.fits 2 774 672
-image-b.fits 2 774 672
-image-c.fits 2 387 336
-@end example
-Another advantage of a table output is that you can directly write the table
to a file.
-For example, if you add @option{--output=fileinfo.fits}, the information above
will be printed into a FITS table.
-You can also pipe it into @ref{Table} to select files based on certain
properties, to sort them based on another property, or any other operation that
can be done with Table (including @ref{Column arithmetic}).
-For example, with the command below, you can select all the files that have a
size larger than 500 pixels in both dimensions.
-@example
-$ astfits *.fits --keyvalue=NAXIS,NAXIS1,NAXIS2 \
- --colinfoinstdout \
- | asttable --range=NAXIS1,500,inf \
- --range=NAXIS2,500,inf -cFILENAME
-image-a.fits
-image-b.fits
-@end example
-Note that @option{--colinfoinstdout} is necessary to use column names when
piping to other programs (like @command{asttable} above).
-Also, with the @option{-cFILENAME} option, we are asking Table to only print
the final file names (we do not need the sizes any more).
-
-The commands with multiple files above used @file{*.fits}, which is only
useful when all your FITS files are in the same directory.
-However, in many cases, your FITS files will be scattered in multiple
sub-directories of a certain top-level directory, or you may only want those
with more particular file name patterns.
-A more powerful way to list the input files to @option{--keyvalue} is to use
the @command{find} program in Unix-like operating systems.
-For example, with the command below you can search all the FITS files in all
the sub-directories of @file{/TOP/DIR}.
+@node Data containers, Data manipulation, Common program behavior, Top
+@chapter Data containers
-@example
-astfits $(find /TOP/DIR/ -name "*.fits") --keyvalue=NAXIS2
-@end example
+@cindex File operations
+@cindex Operations on files
+@cindex General file operations
+The most low-level and basic property of a dataset is how it is stored.
+To process, archive and transmit the data, you need a container to store it
first.
+From the start of the computer age, different formats have been defined to
store data, optimized for particular applications.
+One format/container can never be useful for all applications: the storage
defines the application and vice-versa.
+In astronomy, the Flexible Image Transport System (FITS) standard has become
the most common format of data storage and transmission.
+It has many useful features, for example, multiple sub-containers (also known
as extensions or header data units, HDUs) within one file, or support for
tables as well as images.
+Each HDU can store an independent dataset and its corresponding meta-data.
+Therefore, Gnuastro has one program (see @ref{Fits}) specifically designed to
manipulate FITS HDUs and the meta-data (header keywords) in each HDU.
-@item -O
-@itemx --colinfoinstdout
-Print column information (or metadata) above the column values when writing
keyword values to standard output with @option{--keyvalue}.
-You can read this option as column-information-in-standard-output.
-@end table
+Your astronomical research does not just involve data analysis (where the FITS format is very useful).
+For example, you may want to present your raw and processed FITS images or spectra as figures within slides, reports, or papers.
+The FITS format is not defined for such applications.
+Thus, Gnuastro also comes with the ConvertType program (see @ref{ConvertType}) which can be used to convert a FITS image to and from (where possible) other formats like plain text and JPEG (which allow two-way conversion), along with EPS and PDF (which can only be created from FITS, not the other way round).
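+
+For example, a minimal sketch of converting a FITS image into a JPEG figure for a slide (ConvertType deduces the output format from the suffix given to @option{--output}):
+
+@example
+$ astconvertt image.fits --output=image.jpg
+@end example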
-Below we will discuss the options that can be used to manipulate keywords.
-To see the full list of keywords in a FITS HDU, you can use the
@option{--printallkeys} option.
-If any of the keyword modification options below are requested (for example,
@option{--update}), the headers of the input file/HDU will be changed first,
then printed.
-Keyword modification is done within the input file.
-Therefore, if you want to keep the original FITS file or HDU intact, it is
easiest to create a copy of the file/HDU first and then run Fits on that (for
copying a HDU to another file, see @ref{HDU information and manipulation}.
-In the FITS standard, keywords are always uppercase.
-So case does not matter in the input or output keyword names you specify.
+Finally, the FITS format is not just for images; it can also store tables.
+Binary tables in particular can be very efficient in storing catalogs that
have more than a few tens of columns and rows.
+However, unlike images (where all elements/pixels have one data type), tables
contain multiple columns and each column can have different properties:
independent data types (see @ref{Numeric data types}) and meta-data.
+In practice, each column can be viewed as a separate container that is grouped
with others in the table.
+The only shared property of the columns in a table is thus the number of
elements they contain.
+To allow easy inspection/manipulation of table columns, Gnuastro has the Table
program (see @ref{Table}).
+It can be used to select certain columns of a FITS table and see them as human-readable output on the command-line, or to save them into another plain text or FITS table.
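+
+For example, a minimal sketch of both usages, assuming a catalog with columns named @code{RA} and @code{DEC}: the first command prints the two columns on the command-line, the second saves them into a plain-text file:
+
+@example
+$ asttable catalog.fits --column=RA,DEC
+$ asttable catalog.fits --column=RA,DEC --output=radec.txt
+@end example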
-@cartouche
-@noindent
-@strong{@code{CHECKSUM} automatically updated, when present:} the keyword
modification options will change the contents of the HDU.
-Therefore, if a @code{CHECKSUM} is present in the HDU, after all the keyword
modification options have been complete, Fits will also update @code{CHECKSUM}
before closing the file.
-@end cartouche
+@menu
+* Fits:: View and manipulate extensions and keywords.
+* ConvertType:: Convert data to various formats.
+* Table:: Read and Write FITS tables to plain text.
+* Query:: Import data from external databases.
+@end menu
-Most of the options can accept multiple instances in one command.
-For example, you can add multiple keywords to delete by calling
@option{--delete} multiple times, since repeated keywords are allowed, you can
even delete the same keyword multiple times.
-The action of such options will start from the top most keyword.
-The precedence of operations are described below.
-Note that while the order within each class of actions is preserved, the order
of individual actions is not.
-So irrespective of what order you called @option{--delete} and
@option{--update}.
-First, all the delete operations are going to take effect then the update
operations.
-@enumerate
-@item
-@option{--delete}
-@item
-@option{--rename}
-@item
-@option{--update}
-@item
-@option{--write}
-@item
-@option{--asis}
-@item
-@option{--history}
-@item
-@option{--comment}
-@item
-@option{--date}
-@item
-@option{--printallkeys}
-@item
-@option{--verify}
-@item
-@option{--copykeys}
-@end enumerate
-@noindent
-All possible syntax errors will be reported before the keywords are actually
written.
-FITS errors during any of these actions will be reported, but Fits will not
stop until all the operations are complete.
-If @option{--quitonerror} is called, then Fits will immediately stop upon the
first error.
-@cindex GNU Grep
-If you want to inspect only a certain set of header keywords, it is easiest to
pipe the output of the Fits program to GNU Grep.
-Grep is a very powerful and advanced tool to search strings which is precisely
made for such situations.
-for example, if you only want to check the size of an image FITS HDU, you can
run:
-@example
-$ astfits input.fits | grep NAXIS
-@end example
-@cartouche
-@noindent
-@strong{FITS STANDARD KEYWORDS:}
-Some header keywords are necessary for later operations on a FITS file, for
example, BITPIX or NAXIS, see the FITS standard for their full list.
-If you modify (for example, remove or rename) such keywords, the FITS file
extension might not be usable any more.
-Also be careful for the world coordinate system keywords, if you modify or
change their values, any future world coordinate system (like RA and Dec)
measurements on the image will also change.
-@end cartouche
+@node Fits, ConvertType, Data containers, Data containers
+@section Fits
+@cindex Vatican library
+The ``Flexible Image Transport System'', or FITS, is by far the most common
data container format in astronomy and in constant use since the 1970s.
+Archiving (future usage, simplicity) has been one of the primary design
principles of this format.
+In the last few decades it has proved so useful and robust that the Vatican
Library has also chosen FITS for its ``long-term digital preservation''
project@footnote{@url{https://www.vaticanlibrary.va/home.php?pag=progettodigit}}.
-@noindent
-The keyword related options to the Fits program are fully described below.
-@table @option
+@cindex IAU, international astronomical union
+Although the full name of the standard evokes the idea that it is only for images, it also contains complete and robust features for tables.
+It started off in the 1970s and was formally published as a standard in 1981; it was adopted by the International Astronomical Union (IAU) in 1982, and an IAU working group to maintain its future was defined in 1988.
+The FITS 2.0 and 3.0 standards were approved in 2000 and 2008 respectively,
and the 4.0 draft has also been released recently, please see the
@url{https://fits.gsfc.nasa.gov/fits_standard.html, FITS standard document web
page} for the full text of all versions.
+Also see the @url{https://doi.org/10.1051/0004-6361/201015362, FITS 3.0
standard paper} for a nice introduction and history along with the full
standard.
-@item -d STR
-@itemx --delete=STR
-Delete one instance of the @option{STR} keyword from the FITS header.
-Multiple instances of @option{--delete} can be given (possibly even for the
same keyword, when its repeated in the meta-data).
-All keywords given will be removed from the headers in the same given order.
-If the keyword does not exist, Fits will give a warning and return with a
non-zero value, but will not stop.
-To stop as soon as an error occurs, run with @option{--quitonerror}.
+@cindex Meta-data
+Many common image formats (for example, JPEG) only have one image/dataset per file.
+However, one great advantage of the FITS standard is that it allows you to keep multiple datasets (images or tables along with their separate meta-data) in one file.
+In the FITS standard, each dataset along with its meta-data is known as an extension, or more formally, a header data unit (HDU).
+The HDUs in a file can be completely independent: you can have multiple images
of different dimensions/sizes or tables as separate extensions in one file.
+However, while the standard does not impose any constraints on the relation
between the datasets, it is strongly encouraged to group data that are
contextually related with each other in one file.
+For example, an image and the table/catalog of objects and their measured
properties in that image.
+Other examples can be images of one patch of sky in different colors
(filters), or one raw telescope image along with its calibration data (tables
or images).
-@item -r STR,STR
-@itemx --rename=STR,STR
-Rename a keyword to a new value (for example,
@option{--rename=OLDNAME,NEWNAME}.
-@option{STR} contains both the existing and new names, which should be
separated by either a comma (@key{,}) or a space character.
-Note that if you use a space character, you have to put the value to this
option within double quotation marks (@key{"}) so the space character is not
interpreted as an option separator.
-Multiple instances of @option{--rename} can be given in one command.
-The keywords will be renamed in the specified order.
-If the keyword does not exist, Fits will give a warning and return with a
non-zero value, but will not stop.
-To stop as soon as an error occurs, run with @option{--quitonerror}.
+As discussed above, the extensions in a FITS file can be completely
independent.
+To keep some information (meta-data) about the group of extensions in the FITS
file, the community has adopted the following convention: put no data in the
first extension, so it is just meta-data.
+This extension can thus be used to store meta-data regarding the whole file (the grouping of extensions).
+Subsequent extensions may contain data along with their own separate meta-data.
+All of Gnuastro's programs also follow this convention: the main output
dataset(s) are placed in the second (or later) extension(s).
+The first extension contains no data; the program's configuration (input file name, along with all its option values) is stored as its meta-data, see @ref{Output FITS files}.
-@item -u STR
-@itemx --update=STR
-Update a keyword, its value, its comments and its units in the format
described below.
-If there are multiple instances of the keyword in the header, they will be
changed from top to bottom (with multiple @option{--update} options).
+The meta-data contain information about the data, for example, which region of the sky an image corresponds to, the units of the data, what telescope, camera, and filter the data were taken with, its observation date, or the software that produced it and its configuration.
+Without the meta-data, the raw dataset is practically just a collection of
numbers and really hard to understand, or connect with the real world (other
datasets).
+It is thus strongly encouraged to supplement your data (at any level of
processing) with as much meta-data about your processing/science as possible.
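+
+For example, here is a sketch of adding a hypothetical measured value (with a ``title'' above it for context) to an output of your own processing, using the @option{--write} option of Gnuastro's Fits program that is described later in this section:
+
+@example
+$ astfits myresult.fits --write=/,"Processing info" \
+          --write=SEEING,1.2,"Measured seeing",arcsec
+@end example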
-@noindent
-The format of the values to this option can best be specified with an
-example:
+The meta-data of a FITS file is in ASCII format, which can be easily viewed or
edited with a text editor or on the command-line.
+Each meta-data element (generally known as a keyword) is composed of a name, value, units and comments (the last two are optional).
+For example, below you can see three FITS meta-data keywords for specifying
the world coordinate system (WCS, or its location in the sky) of a dataset:
@example
---update=KEYWORD,value,"comments for this keyword",unit
+LATPOLE = -27.805089 / [deg] Native latitude of celestial pole
+RADESYS = 'FK5' / Equatorial coordinate system
+EQUINOX = 2000.0 / [yr] Equinox of equatorial coordinates
@end example
-If there is a writing error, Fits will give a warning and return with a
non-zero value, but will not stop.
-To stop as soon as an error occurs, run with @option{--quitonerror}.
+However, there are some limitations which discourage viewing/editing the
keywords with text editors.
+For example, there is a fixed length of 80 characters for each keyword (its
name, value, units and comments) and there are no new-line characters, so on a
text editor all the keywords are seen in one line.
+Also, the meta-data keywords are immediately followed by the data, which are commonly in binary format and will show up as strange-looking characters in a text editor, significantly slowing it down.
-@noindent
-The value can be any numerical or string value@footnote{Some tricky situations
arise with values like `@command{87095e5}', if this was intended to be a number
it will be kept in the header as @code{8709500000} and there is no problem.
-But this can also be a shortened Git commit hash.
-In the latter case, it should be treated as a string and stored as it is
written.
-Commit hashes are very important in keeping the history of a file during your
research and such values might arise without you noticing them in your
reproduction pipeline.
-One solution is to use @command{git describe} instead of the short hash alone.
-A less recommended solution is to add a space after the commit hash and Fits
will write the value as `@command{87095e5 }' in the header.
-If you later compare the strings on the shell, the space character will be
ignored by the shell in the latter solution and there will be no problem.}.
-Other than the @code{KEYWORD}, all the other values are optional.
-To leave a given token empty, follow the preceding comma (@key{,}) immediately
with the next.
-If any space character is present around the commas, it will be considered
part of the respective token.
-So if more than one token has space characters within it, the safest method to
specify a value to this option is to put double quotation marks around each
individual token that needs it.
-Note that without double quotation marks, space characters will be seen as
option separators and can lead to undefined behavior.
+Gnuastro's Fits program was designed to allow easy manipulation of FITS
extensions and meta-data keywords on the command-line while conforming fully
with the FITS standard.
+For example, you can copy or cut (copy and remove) HDUs/extensions from one
FITS file to another, or completely delete them.
+It also has features to delete, add, or edit meta-data keywords within one HDU.
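+
+For example, here is a minimal sketch of both kinds of operation, using options that are fully described in the rest of this section and assuming @file{in.fits} has an extension named @code{IMAGE}: the first command copies that extension into @file{out.fits}, the second updates the hypothetical @code{EXPTIME} keyword within the copied HDU:
+
+@example
+$ astfits in.fits --copy=IMAGE --output=out.fits
+$ astfits out.fits -hIMAGE --update=EXPTIME,1000,"Exposure time"
+@end example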
-@item -w STR
-@itemx --write=STR
-Write a keyword to the header.
-For the possible value input formats, comments and units for the keyword, see
the @option{--update} option above.
-The special names (first string) below will cause a special behavior:
+@menu
+* Invoking astfits:: Arguments and options to Fits.
+@end menu
-@table @option
+@node Invoking astfits, , Fits, Fits
+@subsection Invoking Fits
-@item /
-Write a ``title'' to the list of keywords.
-A title consists of one blank line and another which is blank for several
spaces and starts with a slash (@key{/}).
-The second string given to this option is the ``title'' or string printed
after the slash.
-For example, with the command below you can add a ``title'' of `My keywords'
after the existing keywords and add the subsequent @code{K1} and @code{K2}
keywords under it (note that keyword names are not case sensitive).
+Fits can print or manipulate the HDUs (extensions) of a FITS file and the meta-data keywords in a given HDU.
+The executable name is @file{astfits} with the following general template
@example
-$ astfits test.fits -h1 --write=/,"My keywords" \
- --write=k1,1.23,"My first keyword" \
- --write=k2,4.56,"My second keyword"
-$ astfits test.fits -h1
-[[[ ... truncated ... ]]]
-
- / My keywords
-K1 = 1.23 / My first keyword
-K2 = 4.56 / My second keyword
-END
+$ astfits [OPTION...] ASTRdata
@end example
-Adding a ``title'' before each contextually separate group of header keywords
greatly helps in readability and visual inspection of the keywords.
-So generally, when you want to add new FITS keywords, it is good practice to
also add a title before them.
-
-The reason you need to use @key{/} as the keyword name for setting a title is
that @key{/} is the first non-white character.
-The title(s) is(are) written into the FITS with the same order that
@option{--write} is called.
-Therefore in one run of the Fits program, you can specify many different
titles (with their own keywords under them).
-For example, the command below that builds on the previous example and adds
another group of keywords named @code{A1} and @code{A2}.
+@noindent
+One line examples:
@example
-$ astfits test.fits -h1 --write=/,"My keywords" \
- --write=k1,1.23,"My first keyword" \
- --write=k2,4.56,"My second keyword" \
- --write=/,"My second group of keywords" \
- --write=a1,7.89,"First keyword" \
- --write=a2,0.12,"Second keyword"
-@end example
-
-@item checksum
-@cindex CFITSIO
-@cindex @code{DATASUM}: FITS keyword
-@cindex @code{CHECKSUM}: FITS keyword
-When nothing is given afterwards, the header integrity keywords @code{DATASUM}
and @code{CHECKSUM} will be calculated and written/updated.
-The calculation and writing is done fully by CFITSIO, therefore they comply
with the FITS standard
4.0@footnote{@url{https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf}}
that defines these keywords (its Appendix J).
+## View general information about every extension:
+$ astfits image.fits
-If a value is given (e.g., @option{--write=checksum,MyOwnCheckSum}), then
CFITSIO will not be called to calculate these two keywords and the value (as
well as possible comment and unit) will be written just like any other keyword.
-This is generally not recommended since @code{CHECKSUM} is a reserved FITS
standard keyword.
-If you want to calculate the checksum with another hashing standard manually
and write it into the header, it is recommended to use another keyword name.
+## Print the header keywords in the second HDU (counting from 0):
+$ astfits image.fits -h1
-In the FITS standard, @code{CHECKSUM} depends on the HDU's data @emph{and}
header keywords, it will therefore not be valid if you make any further changes
to the header after writing the @code{CHECKSUM} keyword.
-This includes any further keyword modification options in the same call to the
Fits program.
-However, @code{DATASUM} only depends on the data section of the HDU/extension,
so it is not changed when you add, remove or update the header keywords.
-Therefore, it is recommended to write these keywords as the last keywords that
are written/modified in the extension.
-You can use the @option{--verify} option (described below) to verify the
values of these two keywords.
+## Only print header keywords that contain `NAXIS':
+$ astfits image.fits -h1 | grep NAXIS
-@item datasum
-Similar to @option{checksum}, but only write the @code{DATASUM} keyword (that
does not depend on the header keywords, only the data).
-@end table
+## Only print the WCS standard PC matrix elements:
+$ astfits image.fits -h1 | grep 'PC._.'
-@item -a STR
-@itemx --asis=STR
-Write the given @code{STR} @emph{exactly} as it is, into the given FITS file
header with no modifications.
-If the contents of @code{STR} does not conform to the FITS standard for
keywords, then it may (most probably: it will!) corrupt your file and you may
not be able to open it any more.
-So please be @strong{very careful} with this option (its your responsibility
to make sure that the string conforms with the FITS standard for keywords).
+## Copy a HDU from input.fits to out.fits:
+$ astfits input.fits --copy=hdu-name --output=out.fits
-If you want to define the keyword from scratch, it is best to use the
@option{--write} option (see below) and let CFITSIO worry about complying with
the FITS standard.
-Also, you want to copy keywords from one FITS file to another, you can use
@option{--copykeys} that is described below.
-Through these high-level instances, you don't have to worry about low-level
issues.
+## Update the OLDKEY keyword value to 153.034:
+$ astfits input.fits --update=OLDKEY,153.034,"Old keyword comment"
-One common usage of @option{--asis} occurs when you are given the contents of
a FITS header (many keywords) as a plain-text file (so the format of each
keyword line conforms with the FITS standard, just the file is plain-text, and
you have one keyword per line when you open it in a plain-text editor).
-In that case, Gnuastro's Fits program won't be able to parse it (it doesn't
conform to the FITS standard, which doesn't have a new-line character!).
-With the command below, you can insert those headers in @file{headers.txt}
into @file{img.fits} (its HDU number 1, the default; you can change the HDU to
modify with @option{--hdu}).
+## Delete one COMMENT keyword and add a new one:
+$ astfits input.fits --delete=COMMENT --comment="Anything you like ;-)."
-@example
-$ cat headers.txt \
- | while read line; do \
- astfits img.fits --asis="$line"; \
- done
+## Write two new keywords with different values and comments:
+$ astfits input.fits --write=MYKEY1,20.00,"An example keyword" \
+          --write=MYKEY2,fd
+
+## Inspect the area of each pixel based on its WCS (in degree^2).
+## Then convert the area to arcsec^2 with the Arithmetic program.
+$ astfits input.fits --pixelareaonwcs -o pixarea.fits
+$ astarithmetic pixarea.fits 3600 3600 x x -o pixarea_arcsec2.fits
@end example
-@cartouche
-@noindent
-@strong{Don't forget a title:} Since the newly added headers in the example
above weren't originally in the file, they are probably some form of high-level
metadata.
-The raw example above will just append the new keywords after the last one.
-Making it hard for human readability (its not clear what this new group of
keywords signify, where they start, and where this group of keywords end).
-To help the human readability of the header, add a title for this group of
keywords before writing them.
-To do that, run the following command before the @command{cat ...} command
above (replace @code{Imported keys} with any title that best describes this
group of new keywords based on their context):
+@cindex HDU
+@cindex HEALPix
+When no action is requested (and only a file name is given), Fits will print a
list of information about the extension(s) in the file.
+This information includes the HDU number, HDU name (@code{EXTNAME} keyword), type of data (see @ref{Numeric data types}), and the number of data elements it contains (size along each dimension for images, and rows and columns for tables).
+Optionally, a comment column is printed for special situations (like a 2D
HEALPix grid that is usually stored as a 1D dataset/table).
+You can use this to get a general idea of the contents of the FITS file and
what HDU to use for further processing, either with the Fits program or any
other Gnuastro program.
+
+Here is one example of information about a FITS file with four extensions: the first extension has no data; it is a purely meta-data HDU (commonly used to keep meta-data about the whole file, or the grouping of extensions, see @ref{Fits}).
+The second extension is an image with name @code{IMAGE} and single precision floating point type (@code{float32}, see @ref{Numeric data types}); it has 4287 pixels along its first (horizontal) axis and 4286 pixels along its second (vertical) axis.
+The third extension is also an image, with name @code{MASK}.
+It is in 2-byte integer format (@code{int16}), which is commonly used to keep information about pixels (for example, to identify which ones were saturated, or which ones had cosmic rays and so on); note how it has the same size as the @code{IMAGE} extension.
+The fourth extension is a binary table called @code{CATALOG}, which has 12371 rows and 5 columns (it probably contains information about the sources in the image).
+
@example
-$ astfits img.fits --write=/,"Imported keys"
+GNU Astronomy Utilities X.X
+Run on Day Month DD HH:MM:SS YYYY
+-----
+HDU (extension) information: `image.fits'.
+ Column 1: Index (counting from 0).
+ Column 2: Name (`EXTNAME' in FITS standard).
+ Column 3: Image data type or `table' format (ASCII or binary).
+ Column 4: Size of data in HDU.
+-----
+0 n/a uint8 0
+1 IMAGE float32 4287x4286
+2 MASK int16 4287x4286
+3 CATALOG table_binary 12371x5
@end example
-@end cartouche
-@item -H STR
-@itemx --history STR
-Add a @code{HISTORY} keyword to the header with the given value. A new
@code{HISTORY} keyword will be created for every instance of this option. If
the string given to this option is longer than 70 characters, it will be
separated into multiple keyword cards. If there is an error, Fits will give a
warning and return with a non-zero value, but will not stop. To stop as soon as
an error occurs, run with @option{--quitonerror}.
-
-@item -c STR
-@itemx --comment STR
-Add a @code{COMMENT} keyword to the header with the given value.
-Similar to the explanation for @option{--history} above.
+If a specific HDU is identified on the command-line with the @option{--hdu} (or @option{-h}) option and no operation is requested, then the full list of header keywords in that HDU will be printed (as if @option{--printallkeys} was called, see below).
+It is important to remember that this only occurs when @option{--hdu} is given on the command-line.
+An @option{--hdu} value given in a configuration file will only be used when a specific operation on keywords is requested.
+Therefore, as described in the paragraphs above, when no explicit call to the @option{--hdu} option is made on the command-line and no operation is requested (on the command-line or in configuration files), the basic information of each HDU/extension is printed.
-@item -t
-@itemx --date
-Put the current date and time in the header.
-If the @code{DATE} keyword already exists in the header, it will be updated.
-If there is a writing error, Fits will give a warning and return with a
non-zero value, but will not stop.
-To stop as soon as an error occurs, run with @option{--quitonerror}.
+The operating mode and input/output options to Fits are similar to the other
programs and fully described in @ref{Common options}.
+The options particular to Fits can be divided into three groups:
+1) those related to modifying HDUs or extensions (see @ref{HDU information and manipulation});
+2) those related to viewing/modifying meta-data keywords (see @ref{Keyword inspection and manipulation});
+and 3) those related to creating meta-images where each pixel shows values for a specific property of the image (see @ref{Pixel information images}).
+These three classes of options cannot be called together in one run: in any instance of Fits you can either work on the extensions, work on the meta-data keywords, or create meta-images where each pixel shows particular information about the image itself.
-@item -p
-@itemx --printallkeys
-Print the full metadata (keywords, values, units and comments) in the
specified FITS extension (HDU).
-If this option is called along with any of the other keyword editing commands,
as described above, all other editing commands take precedence to this.
-Therefore, it will print the final keywords after all the editing has been
done.
+@menu
+* HDU information and manipulation:: Learn about the HDUs and move them.
+* Keyword inspection and manipulation:: Manipulate metadata keywords in a HDU.
+* Pixel information images:: Pixel values contain information on the pixels.
+@end menu
-@item --printkeynames
-Print only the keyword names of the specified FITS extension (HDU), one line
per name.
-This option must be called alone.
-@item -v
-@itemx --verify
-Verify the @code{DATASUM} and @code{CHECKSUM} data integrity keywords of the
FITS standard.
-See the description under the @code{checksum} (under @option{--write}, above)
for more on these keywords.
-This option will print @code{Verified} for both keywords if they can be
verified.
-Otherwise, if they do not exist in the given HDU/extension, it will print
@code{NOT-PRESENT}, and if they cannot be verified it will print
@code{INCORRECT}.
-In the latter case (when the keyword values exist but cannot be verified), the
Fits program will also return with a failure.
-By default this function will also print a short description of the
@code{DATASUM} AND @code{CHECKSUM} keywords.
-You can suppress this extra information with @code{--quiet} option.
-@item --copykeys=INT:INT/STR,STR[,STR]
-Copy the desired set of the input's keyword records, to the to the output
(specified with the @option{--output} and @option{--outhdu} for the filename
and HDU/extension respectively).
-The keywords to copy can be given either as a range (in the format of
@code{INT:INT}, inclusive) or a list of keyword names as comma-separated
strings (@code{STR,STR}), the list can have any number of keyword names.
-More details and examples of the two forms are given below:
+@node HDU information and manipulation, Keyword inspection and manipulation,
Invoking astfits, Invoking astfits
+@subsubsection HDU information and manipulation
+Each FITS file header data unit, or HDU (also known as an extension), is an independent dataset (data + meta-data).
+Multiple HDUs can be stored in one FITS file, see @ref{Fits}.
+The general HDU-related options to the Fits program are listed below in two general classes:
+the first group focuses on HDU information, while the latter focuses on manipulating (moving or deleting) the HDUs.
-@table @asis
-@item Range
-The given string to this option must be two integers separated by a colon
(@key{:}).
-The first integer must be positive (counting of the keyword records starts
from 1).
-The second integer may be negative (zero is not acceptable) or an integer
larger than the first.
+The options below print information about the given HDU on the command-line.
+Thus they cannot be called together in one command (each has its own
independent output).
-A negative second integer means counting from the end.
-So @code{-1} is the last copy-able keyword (not including the @code{END}
keyword).
+@table @option
+@item -n
+@itemx --numhdus
+Print the number of extensions/HDUs in the given file.
+Note that this option must be called alone and will only print a single number.
+It is thus useful in scripts, for example, when you need to check the number of extensions in a FITS file.
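+
+For example, a minimal sketch of using it within a shell script:
+
+@example
+$ nhdu=$(astfits image.fits --numhdus)
+$ echo "image.fits contains $nhdu HDUs"
+@end example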
-To see the header keywords of the input with a number before them, you can
pipe the output of the Fits program (when it prints all the keywords in an
extension) into the @command{cat} program like below:
+For a complete list of basic meta-data on the extensions in a FITS file, do
not use any of the options in this section or in @ref{Keyword inspection and
manipulation}.
+For more, see @ref{Invoking astfits}.
-@example
-$ astfits input.fits -h1 | cat -n
-@end example
+@item --hastablehdu
+Print @code{1} (on standard output) if at least one table HDU (ASCII or
binary) exists in the FITS file.
+Otherwise (when no table HDU exists in the file), print @code{0}.
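+
+For example, a sketch of using this output within a shell conditional:
+
+@example
+$ if [ "$(astfits input.fits --hastablehdu)" = 1 ]; then \
+      echo "input.fits contains at least one table"; fi
+@end example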
-@item List of names
-The given string to this option must be a comma separated list of keyword
names.
-For example, see the command below:
+@item --listtablehdus
+Print the names or numbers (when a name does not exist, counting from zero) of
HDUs that contain a table (ASCII or Binary) on standard output, one per line.
+Otherwise (when no table HDU exists in the file) nothing will be printed.
-@example
-$ astfits input.fits -h1 --copykeys=KEY1,KEY2 \
- --output=output.fits --outhdu=1
-@end example
+@item --hasimagehdu
+Print @code{1} (on standard output) if at least one image HDU exists in the
FITS file.
+Otherwise (when no image HDU exists in the file), print @code{0}.
-Please consider the notes below when copying keywords with names:
-@itemize
-@item
-If the number of characters in the name is more than 8, CFITSIO will place a
@code{HIERARCH} before it.
-In this case simply give the name and do not give the @code{HIERARCH} (which
is a constant and not considered part of the keyword name).
-@item
-If your keyword name is composed only of digits, do not give it as the first
name given to @option{--copykeys}.
-Otherwise, it will be confused with the range format above.
-You can safely give an only-digit keyword name as the second, or third
requested keywords.
-@item
-If the keyword is repeated more than once in the header, currently only the
first instance will be copied.
-In other words, even if you call @option{--copykeys} multiple times with the
same keyword name, its first instance will be copied.
-If you need to copy multiple instances of the same keyword, please get in
touch with us at @code{bug-gnuastro@@gnu.org}.
-@end itemize
+In the FITS standard, any array with any dimensions is called an ``image''; therefore, this option includes 1, 3 and 4 dimensional arrays too.
+However, an image HDU with zero dimensions (which is usually the first extension and only contains metadata) is not counted here.
-@end table
+@item --listimagehdus
+Print the names or numbers (when a name does not exist, counting from zero) of
HDUs that contain an image on standard output, one per line.
+Otherwise (when no image HDU exists in the file) nothing will be printed.
-@item --outhdu
-The HDU/extension to write the output keywords of @option{--copykeys}.
+In the FITS standard, any array with any dimensions is called an ``image''; therefore, this option includes 1, 3 and 4 dimensional arrays too.
+However, an image HDU with zero dimensions (which is usually the first extension and only contains metadata) is not counted here.
-@item -Q
-@itemx --quitonerror
-Quit if any of the operations above are not successful.
-By default if an error occurs, Fits will warn the user of the faulty keyword
and continue with the rest of actions.
+@item --listallhdus
+Print the names or numbers (when a name does not exist, counting from zero) of
all HDUs within the input file on the standard output, one per line.
-@item -s STR
-@itemx --datetosec STR
-@cindex Unix epoch time
-@cindex Time, Unix epoch
-@cindex Epoch, Unix time
-Interpret the value of the given keyword in the FITS date format (most
generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}) and return the corresponding Unix
epoch time (number of seconds that have passed since 00:00:00 Thursday, January
1st, 1970).
-The @code{Thh:mm:ss.ddd...} section (specifying the time of day), and also the
@code{.ddd...} (specifying the fraction of a second) are optional.
-The value to this option must be the FITS keyword name that contains the
requested date, for example, @option{--datetosec=DATE-OBS}.
+@item --pixelscale
+Print the HDU's pixel-scale (change in world coordinate for one pixel along
each dimension) and pixel area or voxel volume.
+Without the @option{--quiet} option, the output of @option{--pixelscale} has
multiple lines and explanations, thus being more human-friendly.
+It prints the file/HDU name, number of dimensions, and the units along with
the actual pixel scales.
+Also, when any of the units are in degrees, the pixel scales and area/volume
are also printed in units of arc-seconds.
+For 3D datasets, the pixel area (on each 2D slice of the 3D cube) is printed
as well as the voxel volume.
+If you only want the pixel area of a 2D image in units of arcsec@mymath{^2}, you can use @option{--pixelareaarcsec2} (described below).
-@cindex GNU C Library
-This option can also interpret the older FITS date format
(@code{DD/MM/YYThh:mm:ss.ddd...}) where only two characters are given to the
year.
-In this case (following the GNU C Library), this option will make the
following assumption: values 68 to 99 correspond to the years 1969 to 1999, and
values 0 to 68 as the years 2000 to 2068.
+However, in scripts (that are to be run automatically), this human-friendly format is inconvenient, so when called with the @option{--quiet} option, only the pixel-scale values along each dimension are printed in one line.
+These numbers are followed by the pixel area (in the raw WCS units).
+For 3D datasets, this is the area of each 2D slice, and a final number (the voxel volume) is also printed.
+In summary, in @option{--quiet} mode, three numbers are printed for 2D datasets and five numbers for 3D datasets.
+If the dataset has more than 3 dimensions, only the pixel-scale values are
printed (no area or volume will be printed).
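+
+For example, in a script you can store the pixel scale along the first dimension in a shell variable like this (a sketch with a hypothetical file name; in the @option{--quiet} output, the first number is the pixel scale along the first dimension):
+
+@example
+$ pixscale=$(astfits image.fits --pixelscale --quiet \
+                     | awk '@{print $1@}')
+@end example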
-This is a very useful option for operations on the FITS date values, for
example, sorting FITS files by their dates, or finding the time difference
between two FITS files.
-The advantage of working with the Unix epoch time is that you do not have to
worry about calendar details (for example, the number of days in different
months, or leap years).
+@item --pixelareaarcsec2
+Print the HDU's pixel area in units of arcsec@mymath{^2}.
+This option only works on 2D images that have WCS coordinates in units of degrees.
+For lower-level information about the pixel scale in each dimension, see
@option{--pixelscale} (described above).
-@item --wcscoordsys=STR
-@cindex Galactic coordinate system
-@cindex Ecliptic coordinate system
-@cindex Equatorial coordinate system
-@cindex Supergalactic coordinate system
-@cindex Coordinate system: Galactic
-@cindex Coordinate system: Ecliptic
-@cindex Coordinate system: Equatorial
-@cindex Coordinate system: Supergalactic
-Convert the coordinate system of the image's world coordinate system (WCS) to
the given coordinate system (@code{STR}) and write it into the file given to
@option{--output} (or an automatically named file if no @option{--output} has
been given).
+@item --skycoverage
+@cindex Image's sky coverage
+@cindex Coverage of image over sky
+Print the rectangular area (or 3D cube) covered by the given image/datacube
HDU over the Sky in the WCS units.
+The covered area is reported in two ways:
+1) the center and full width in each dimension,
+2) the minimum and maximum sky coordinates in each dimension.
+This option is thus useful when you want to get a general feeling of a new image/dataset, or to prepare the inputs for querying external databases in the region of the image (for example, with @ref{Query}).
-For example, with the command below, @file{img-eq.fits} will have an identical
dataset (pixel values) as @file{image.fits}.
-However, the WCS coordinate system of @file{img-eq.fits} will be the
equatorial coordinate system in the Julian calendar epoch 2000 (which is the
most common epoch used today).
-Fits will automatically extract the current coordinate system of
@file{image.fits} and as long as it is one of the recognized coordinate systems
listed below, it will do the conversion.
+If run without the @option{--quiet} option, the values are given with a
human-friendly description.
+For example, here is the output of this option on an image taken near the star
Castor:
@example
-$ astfits image.fits --wcscoordsys=eq-j2000 \
- --output=img-eq.fits
+$ astfits castor.fits --skycoverage
+Input file: castor.fits (hdu: 1)
+
+Sky coverage by center and (full) width:
+ Center: 113.9149075 31.93759664
+ Width: 2.41762045 2.67945253
+
+Sky coverage by range along dimensions:
+ RA 112.7235592 115.1411797
+ DEC 30.59262123 33.27207376
@end example
-The currently recognized coordinate systems are listed below (the most common
one today is @code{eq-j2000}):
+With the @option{--quiet} option, the values are more machine-friendly (easy
to parse).
+It has two lines, where the first line contains the center/width values and
the second line shows the coordinate ranges in each dimension.
-@table @code
-@item eq-j2000
-2000.0 (Julian-year) equatorial coordinates.
-@item eq-b1950
-1950.0 (Besselian-year) equatorial coordinates.
-@item ec-j2000
-2000.0 (Julian-year) ecliptic coordinates.
-@item ec-b1950
-1950.0 (Besselian-year) ecliptic coordinates.
-@item galactic
-Galactic coordinates.
-@item supergalactic
-Supergalactic coordinates.
-@end table
-
-The Equatorial and Ecliptic coordinate systems are defined by the mean equator
and equinox epoch: either the Besselian year 1950.0, or the Julian year 2000.
-For more on their difference and links for further reading about epochs in
astronomy, please see the description in
@url{https://en.wikipedia.org/wiki/Epoch_(astronomy), Wikipedia}.
-
-@item --wcsdistortion=STR
-@cindex WCS distortion
-@cindex Distortion, WCS
-@cindex SIP WCS distortion
-@cindex TPV WCS distortion
-If the argument has a WCS distortion, the output (file given with the
@option{--output} option) will have the distortion given to this option (for
example, @code{SIP}, @code{TPV}).
-The output will be a new file (with a copy of the image, and the new WCS), so
if it already exists, the file will be delete (unless you use the
@code{--dontdelete} option, see @ref{Input output options}).
-
-With this option, the Fits program will read the minimal set of keywords from
the input HDU and the HDU data.
-It will then write them into the file given to the @option{--output} option
but with a newly created set of WCS-related keywords corresponding to the
desired distortion standard.
+@example
+$ astfits castor.fits --skycoverage --quiet
+113.9149075 31.93759664 2.41762045 2.67945253
+112.7235592 115.1411797 30.59262123 33.27207376
+@end example
-If no @option{--output} file is specified, an automatically generated output
name will be used which is composed of the input's name but with the
@file{-DDD.fits} suffix, see @ref{Automatic output}.
-Where @file{DDD} is the value given to this option (desired output distortion).
+Note that this is a simple rectangle (a cube in 3D) definition, so if the image is rotated in relation to the celestial coordinates, a general polygon is necessary to describe the coverage exactly.
+Hence, when there is rotation, the reported area will be larger than the actual area containing data; you can visually inspect the area with the @option{--pixelareaonwcs} option of @ref{Fits}.
-Note that all possible conversions between all standards are not yet supported.
-If the requested conversion is not supported, an informative error message
will be printed.
-If this happens, please let us know and we will try our best to add the
respective conversions.
+Currently this option only supports images that are less than 180 degrees in width (which is usually the case!).
+This requirement has been necessary to account for images that cross the RA=0 hour circle on the sky.
+Please get in touch with us at @url{mailto:bug-gnuastro@@gnu.org} if you have an image that is wider than 180 degrees, so we can try to find a solution based on your needs.
-For example, with the command below, you can be sure that if @file{in.fits}
has a distortion in its WCS, the distortion of @file{out.fits} will be in the
SIP standard.
+@item --datasum
+@cindex @code{DATASUM}: FITS keyword
+Calculate and print the given HDU's "datasum" to stdout.
+The given HDU is specified with the @option{--hdu} (or @option{-h}) option.
+This number is calculated by parsing all the bytes of the given HDU's data
records (excluding keywords).
+This option ignores any possibly existing @code{DATASUM} keyword in the HDU.
+For more on @code{DATASUM} in the FITS standard, see @ref{Keyword inspection
and manipulation} (under the @code{checksum} component of @option{--write}).
-@example
-$ astfits in.fits --wcsdistortion=SIP --output=out.fits
-@end example
+You can use this option to confirm that the data in two different HDUs
(possibly with different keywords) is identical.
+Its advantage over @option{--write=datasum} (which writes the @code{DATASUM}
keyword into the given HDU) is that it does not require write permissions.
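+
+For example, you can check if the data in the first extensions of two files are identical like this (a sketch with hypothetical file names):
+
+@example
+$ if [ "$(astfits a.fits -h1 --datasum)" \
+       = "$(astfits b.fits -h1 --datasum)" ]; then \
+    echo "Identical data"; \
+  fi
+@end example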
@end table
+The following options manipulate (copy, move or delete) the HDUs of one FITS file, possibly into another FITS file.
+These options may be called multiple times in one run.
+If so, the extensions will be copied from the input FITS file to the output
FITS file in the given order (on the command-line and also in configuration
files, see @ref{Configuration file precedence}).
+If the separate classes are called together in one run of Fits, then first
@option{--copy} is run (on all specified HDUs), followed by @option{--cut}
(again on all specified HDUs), and then @option{--remove} (on all specified
HDUs).
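+
+For example, in the single (sketched) command below with hypothetical file names, the first extension is copied into @file{out.fits} before the third extension is removed from @file{in.fits}, even though @option{--remove} comes first on the command-line:
+
+@example
+$ astfits in.fits --remove=3 --copy=1 --output=out.fits
+@end example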
-@node Pixel information images, , Keyword inspection and manipulation,
Invoking astfits
-@subsubsection Pixel information images
-In @ref{Keyword inspection and manipulation} options like
@option{--pixelscale} were introduced for information on the pixels from the
keywords.
-But that only provides a single value for all the pixels!
-This will not be sufficient in some scenarios; for example due to distortion,
different regions of the image will have different pixel areas when projected
onto the sky.
-
-@cindex Meta image
-The options in this section provide such ``meta'' images: images where the
pixel values are information about the pixel itself.
-Such images can be useful in understanding the underlying pixel grid with the
same tools that you study the astronomical objects within the image (like
@ref{SAO DS9}).
-After all, nothing beats visual inspection with tools you are familiar with.
+The @option{--copy} and @option{--cut} options need an output FITS file
(specified with the @option{--output} option).
+If the output file exists, then the specified HDU will be copied following the
last extension of the output file (the existing HDUs in it will be untouched).
+Thus, after Fits finishes, the copied HDU will be the last HDU of the output
file.
+If no output file name is given, then automatic output will be used to store
the HDUs given to this option (see @ref{Automatic output}).
-@table @code
-@item --pixelareaonwcs
-Create a meta-image where each pixel's value shows its area in the WCS units
(usually degrees squared).
-The output is therefore the same size as the input.
+@table @option
-@cindex Pixel mixing
-@cindex Area resampling
-@cindex Resampling by area
-This option uses the same ``pixel mixing'' or ``area resampling'' concept that
is described in @ref{Resampling} (as part of the Warp program).
-Similar to Warp, its sampling can be tuned with the @option{--edgesampling}
that is described below.
+@item -C STR
+@itemx --copy=STR
+Copy the specified extension into the output file, see explanations above.
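+
+For example, the command below (a sketch with hypothetical names) appends the second and third extensions of @file{in.fits} to @file{out.fits}:
+
+@example
+$ astfits in.fits --copy=2 --copy=3 --output=out.fits
+@end example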
-@cindex Distortion
-@cindex Area of pixel on sky
-One scenario where this option becomes handy is when you are debugging aligned
images using the Warp program (see @ref{Warp}).
-You may observe gradients after warping and can check if they caused by the
distortion of the instrument or not.
-Such gradients can happen due to distortions because the detectors pixels are
measuring photons from different areas on the sky (or the type of projection
you're seeing).
-This effect is more pronounced in images covering larger portions of the sky,
for instance, the TESS
images@footnote{@url{https://www.nasa.gov/tess-transiting-exoplanet-survey-satellite}}.
+@item -k STR
+@itemx --cut=STR
+Cut (copy to output, remove from input) the specified extension into the
+output file, see explanations above.
-Here is an example usage of the @option{--pixelareaonwcs} option:
+@item -R STR
+@itemx --remove=STR
+Remove the specified HDU from the input file.
-@example
-# Check the area each 'input.fits' pixel takes in sky
-$ astfits input.fits -h1 --pixelareaonwcs -o pixarea.fits
+The first (zero-th) HDU cannot be removed with this option.
+Consider using @option{--copy} or @option{--cut} in combination with @option{--primaryimghdu} to avoid an empty zero-th HDU.
+From CFITSIO: ``In the case of deleting the primary array (the first HDU in
the file) then [it] will be replaced by a null primary array containing the
minimum set of required keywords and no data.''.
+So in practice, any existing data (array) and meta-data in the first extension
will be removed, but the number of extensions in the file will not change.
+This is because of the unique position the first FITS extension has in the
FITS standard (for example, it cannot be used to store tables).
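+
+For example, the command below (a sketch with a hypothetical file name) removes the second extension of @file{in.fits}:
+
+@example
+$ astfits in.fits --remove=2
+@end example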
-# Convert each pixel's area to arcsec^2
-$ astarithmetic pixarea.fits 3600 3600 x x \
- --output=pixarea_arcsec2.fits
+@item --primaryimghdu
+Copy or cut an image HDU to the zero-th HDU/extension of a file that does not yet exist.
+This option is thus irrelevant if the output file already exists or the
copied/cut extension is a FITS table.
+For example, with the commands below, first we make sure that @file{out.fits}
does not exist, then we copy the first extension of @file{in.fits} to the
zero-th extension of @file{out.fits}.
-# Compare area relative to the actual reported pixel scale
-$ pixarea=$(astfits input.fits --pixelscale -q \
- | awk '@{print $3@}')
-$ astarithmetic pixarea.fits $pixarea / -o pixarea_rel.fits
+@example
+$ rm -f out.fits
+$ astfits in.fits --copy=1 --primaryimghdu --output=out.fits
@end example
-@item --edgesampling=INT
-Extra sampling along the pixel edges for @option{--pixelareaonwcs}.
-The default value is 0, meaning that only the pixel vertices are used.
-Values greater than zero improve the accuracy in the expense of greater time
and memory consumption.
-With that said, the default value of zero usually has a good precision unless
the given image has extreme distortions that produce irregular pixel shapes.
-For more, see @ref{Align pixels with WCS considering distortions}).
-
-@cartouche
-@noindent
-@strong{Caution:} This option does not ``oversample'' the output image!
-Rather, it makes Warp use more points to calculate the @emph{input} pixel area.
-To oversample the output image, set a reasonable @option{--cdelt} value.
-@end cartouche
+If we had not used @option{--primaryimghdu}, then the zero-th extension of
@file{out.fits} would have no data, and its second extension would host the
copied image (just like any other output of Gnuastro).
@end table
+@node Keyword inspection and manipulation, Pixel information images, HDU
information and manipulation, Invoking astfits
+@subsubsection Keyword inspection and manipulation
+The meta-data in each header data unit, or HDU (also known as extension, see
@ref{Fits}) is stored as ``keyword''s.
+Each keyword consists of a name, value, unit, and comments.
+The Fits program (see @ref{Fits}) options related to viewing and manipulating
keywords in a FITS HDU are described below.
+First, let's review the @option{--keyvalue} option which should be called
separately from the rest of the options described in this section.
+Also, unlike the rest of the options in this section, with
@option{--keyvalue}, you can give more than one input file.
+@table @option
+@item -l STR[,STR[,...]]
+@itemx --keyvalue=STR[,STR[,...]]
+Only print the value of the requested keyword(s): the @code{STR}s.
+@option{--keyvalue} can be called multiple times, and each call can contain
multiple comma-separated keywords.
+If more than one file is given, this option uses the same HDU/extension for all of them (the value given to @option{--hdu}).
+For example, you can get the number of dimensions of the three FITS files in
the running directory, as well as the length along each dimension, with this
command:
+@example
+$ astfits *.fits --keyvalue=NAXIS,NAXIS1 --keyvalue=NAXIS2
+image-a.fits 2 774 672
+image-b.fits 2 774 672
+image-c.fits 2 387 336
+@end example
+If only one input is given, and the @option{--quiet} option is activated, the
file name is not printed on the first column, only the values of the requested
keywords.
+@example
+$ astfits image-a.fits --keyvalue=NAXIS,NAXIS1 \
+ --keyvalue=NAXIS2 --quiet
+2 774 672
+@end example
+The output is internally stored (and finally printed) as a table (with one
column per keyword).
+Therefore just like the Table program, you can use @option{--colinfoinstdout}
to print the metadata like the example below (also see @ref{Invoking asttable}).
+The keyword metadata (comments and units) are extracted from the comments and units of the keyword in the input files (from the first file that has a comment or unit).
+Hence, if the keyword does not have units or comments in any of the input files, they will be empty.
+For more on Gnuastro's plain-text metadata format, see @ref{Gnuastro text
table format}.
+@example
+$ astfits *.fits --keyvalue=NAXIS,NAXIS1,NAXIS2 \
+ --colinfoinstdout
+# Column 1: FILENAME [name,str10,] Name of input file.
+# Column 2: NAXIS [ ,u8 ,] number of data axes
+# Column 3: NAXIS1 [ ,u16 ,] length of data axis 1
+# Column 4: NAXIS2 [ ,u16 ,] length of data axis 2
+image-a.fits 2 774 672
+image-b.fits 2 774 672
+image-c.fits 2 387 336
+@end example
+Another advantage of a table output is that you can directly write the table
to a file.
+For example, if you add @option{--output=fileinfo.fits}, the information above
will be printed into a FITS table.
+You can also pipe it into @ref{Table} to select files based on certain
properties, to sort them based on another property, or any other operation that
can be done with Table (including @ref{Column arithmetic}).
+For example, with the command below, you can select all the files that have a
size larger than 500 pixels in both dimensions.
+@example
+$ astfits *.fits --keyvalue=NAXIS,NAXIS1,NAXIS2 \
+ --colinfoinstdout \
+ | asttable --range=NAXIS1,500,inf \
+ --range=NAXIS2,500,inf -cFILENAME
+image-a.fits
+image-b.fits
+@end example
+Note that @option{--colinfoinstdout} is necessary to use column names when
piping to other programs (like @command{asttable} above).
+Also, with the @option{-cFILENAME} option, we are asking Table to only print
the final file names (we do not need the sizes any more).
+The commands with multiple files above used @file{*.fits}, which is only
useful when all your FITS files are in the same directory.
+However, in many cases, your FITS files will be scattered in multiple
sub-directories of a certain top-level directory, or you may only want those
with more particular file name patterns.
+A more powerful way to list the input files to @option{--keyvalue} is to use
the @command{find} program in Unix-like operating systems.
+For example, with the command below you can search all the FITS files in all
the sub-directories of @file{/TOP/DIR}.
+@example
+astfits $(find /TOP/DIR/ -name "*.fits") --keyvalue=NAXIS2
+@end example
+@item -O
+@itemx --colinfoinstdout
+Print column information (or metadata) above the column values when writing
keyword values to standard output with @option{--keyvalue}.
+You can read this option as column-information-in-standard-output.
+@end table
+Below we will discuss the options that can be used to manipulate keywords.
+To see the full list of keywords in a FITS HDU, you can use the
@option{--printallkeys} option.
+If any of the keyword modification options below are requested (for example,
@option{--update}), the headers of the input file/HDU will be changed first,
then printed.
+Keyword modification is done within the input file.
+Therefore, if you want to keep the original FITS file or HDU intact, it is easiest to create a copy of the file/HDU first and then run Fits on that (for copying an HDU to another file, see @ref{HDU information and manipulation}).
+In the FITS standard, keyword names are always uppercase, so the case of the keyword names you specify does not matter.
+@cartouche
+@noindent
+@strong{@code{CHECKSUM} automatically updated, when present:} the keyword
modification options will change the contents of the HDU.
+Therefore, if a @code{CHECKSUM} is present in the HDU, after all the keyword modification options have been completed, Fits will also update @code{CHECKSUM} before closing the file.
+@end cartouche
+Most of the options can accept multiple instances in one command.
+For example, you can request multiple deletions by calling @option{--delete} multiple times; since repeated keywords are allowed, you can even delete the same keyword multiple times.
+The action of such options will start from the top-most keyword.
+The precedence of the operations is described below.
+Note that while the order within each class of actions is preserved, the order of the individual actions is not: irrespective of the order in which you call @option{--delete} and @option{--update}, all the delete operations will take effect first, followed by the update operations.
+@enumerate
+@item
+@option{--delete}
+@item
+@option{--rename}
+@item
+@option{--update}
+@item
+@option{--write}
+@item
+@option{--asis}
+@item
+@option{--history}
+@item
+@option{--comment}
+@item
+@option{--date}
+@item
+@option{--printallkeys}
+@item
+@option{--verify}
+@item
+@option{--copykeys}
+@end enumerate
+@noindent
+All possible syntax errors will be reported before the keywords are actually
written.
+FITS errors during any of these actions will be reported, but Fits will not
stop until all the operations are complete.
+If @option{--quitonerror} is called, then Fits will immediately stop upon the
first error.
+@cindex GNU Grep
+If you want to inspect only a certain set of header keywords, it is easiest to
pipe the output of the Fits program to GNU Grep.
+Grep is a very powerful and advanced tool for searching strings, precisely made for such situations.
+For example, if you only want to check the size of an image FITS HDU, you can run:
-@node ConvertType, Table, Fits, Data containers
-@section ConvertType
-
-@cindex Data format conversion
-@cindex Converting data formats
-@cindex Image format conversion
-@cindex Converting image formats
-@pindex @r{ConvertType (}astconvertt@r{)}
-The FITS format used in astronomy was defined mainly for archiving,
transmission, and processing.
-In other situations, the data might be useful in other formats.
-For example, when you are writing a paper or report, or if you are making
slides for a talk, you cannot use a FITS image.
-Other image formats should be used.
-In other cases you might want your pixel values in a table format as plain
text for input to other programs that do not recognize FITS.
-ConvertType is created for such situations.
-The various types will increase with future updates and based on need.
+@example
+$ astfits input.fits | grep NAXIS
+@end example
-The conversion is not only one way (from FITS to other formats), but two ways
(except the EPS and PDF formats@footnote{Because EPS and PDF are vector, not
raster/pixelated formats}).
-So you can also convert a JPEG image or text file into a FITS image.
-Basically, other than EPS/PDF, you can use any of the recognized formats as
different color channel inputs to get any of the recognized outputs.
+@cartouche
+@noindent
+@strong{FITS STANDARD KEYWORDS:}
+Some header keywords are necessary for later operations on a FITS file, for
example, BITPIX or NAXIS, see the FITS standard for their full list.
+If you modify (for example, remove or rename) such keywords, the FITS file
extension might not be usable any more.
+Also be careful with the world coordinate system keywords: if you change their values, any future world coordinate system measurements (like RA and Dec) on the image will also change.
+@end cartouche
-Before explaining the options and arguments (in @ref{Invoking astconvertt}),
we will start with a short discussion on the difference between raster and
vector graphics in @ref{Raster and Vector graphics}.
-In ConvertType, vector graphics are used to add markers over your originally
rasterized data, producing high quality images, ready to be used in your
exciting papers.
-We will continue with a description of the recognized files types in
@ref{Recognized file formats}, followed a short introduction to digital color
in @ref{Color}.
-A tutorial on how to add markers over an image is then given in @ref{Marking
objects for publication} and we conclude with a @LaTeX{} based solution to add
coordinates over an image.
-@menu
-* Raster and Vector graphics:: Images coming from nature, and the abstract.
-* Recognized file formats:: Recognized file formats
-* Color:: Some explanations on color.
-* Annotations for figure in paper:: Adding coordinates or physical scale.
-* Invoking astconvertt:: Options and arguments to ConvertType.
-@end menu
+@noindent
+The keyword related options to the Fits program are fully described below.
+@table @option
-@node Raster and Vector graphics, Recognized file formats, ConvertType,
ConvertType
-@subsection Raster and Vector graphics
+@item -d STR
+@itemx --delete=STR
+Delete one instance of the @option{STR} keyword from the FITS header.
+Multiple instances of @option{--delete} can be given (possibly even for the same keyword, when it is repeated in the meta-data).
+All keywords given will be removed from the headers in the same given order.
+If the keyword does not exist, Fits will give a warning and return with a
non-zero value, but will not stop.
+To stop as soon as an error occurs, run with @option{--quitonerror}.
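+
+For example, the command below (a sketch) deletes the first two instances of the repeatable @code{COMMENT} keyword from the first HDU, assuming at least two exist:
+
+@example
+$ astfits image.fits -h1 --delete=COMMENT --delete=COMMENT
+@end example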
-@cindex Raster graphics
-@cindex Graphics (raster)
-Images that are produced by a hardware (for example, the camera in your phone,
or the camera connected to a telescope) provide pixelated data.
-Such data are therefore stored in a
@url{https://en.wikipedia.org/wiki/Raster_graphics, Raster graphics} format
which has discrete, independent, equally spaced data elements.
-For example, this is the format used FITS (see @ref{Fits}), JPEG, TIFF, PNG
and other image formats.
+@item -r STR,STR
+@itemx --rename=STR,STR
+Rename a keyword to a new name (for example, @option{--rename=OLDNAME,NEWNAME}).
+@option{STR} contains both the existing and new names, which should be
separated by either a comma (@key{,}) or a space character.
+Note that if you use a space character, you have to put the value to this
option within double quotation marks (@key{"}) so the space character is not
interpreted as an option separator.
+Multiple instances of @option{--rename} can be given in one command.
+The keywords will be renamed in the specified order.
+If the keyword does not exist, Fits will give a warning and return with a
non-zero value, but will not stop.
+To stop as soon as an error occurs, run with @option{--quitonerror}.
-@cindex Vector graphics
-@cindex Graphics (vector)
-On the other hand, when something is generated by the computer (for example, a
diagram, plot or even adding a cross over a camera image to highlight something
there), there is no ``observation'' or connection with nature!
-Everything is abstract!
-For such things, it is much easier to draw a mathematical line (with infinite
resolution).
-Therefore, no matter how much you zoom-in, it will never get pixelated.
-This is the realm of @url{https://en.wikipedia.org/wiki/Vector_graphics,
Vector graphics}.
-If you open the Gnuastro manual in
@url{https://www.gnu.org/software/gnuastro/manual/gnuastro.pdf, PDF format} You
can see such graphics in the Gnuastro manual, for example, in @ref{Circles and
the complex plane} or @ref{Distance on a 2D curved space}.
-The most common vector graphics format is PDF for document sharing or SVG for
web-based applications.
+@item -u STR
+@itemx --update=STR
+Update a keyword, its value, its comments and its units in the format
described below.
+If there are multiple instances of the keyword in the header, they will be changed from top to bottom (one instance per @option{--update} call).
-The pixels of a raster image can be shown as vector-based squares with
different shades, so vector graphics can generally also support raster graphics.
-This is very useful when you want to add some graphics over an image to help
your discussion (for example a @mymath{+} over your object of interest).
-However, vector graphics is not optimized for rasterized data (which are
usually also noisy!), and can either not display nicely, or result in much
larger file volume (in bytes).
-Therefore, if it is not necessary to add any marks over a FITS image, for
example, it may be better to store it in a rasterized format.
+@noindent
+The format of the values to this option can best be specified with an
+example:
-The distinction between the vector and raster graphics is also the primary
theme behind Gnuastro's logo, see @ref{Logo of Gnuastro}.
+@example
+--update=KEYWORD,value,"comments for this keyword",unit
+@end example
+If there is a writing error, Fits will give a warning and return with a
non-zero value, but will not stop.
+To stop as soon as an error occurs, run with @option{--quitonerror}.
-@node Recognized file formats, Color, Raster and Vector graphics, ConvertType
-@subsection Recognized file formats
+@noindent
+The value can be any numerical or string value@footnote{Some tricky situations arise with values like `@command{87095e5}': if this was intended to be a number, it will be kept in the header as @code{8709500000} and there is no problem.
+But this can also be a shortened Git commit hash.
+In the latter case, it should be treated as a string and stored as it is
written.
+Commit hashes are very important in keeping the history of a file during your
research and such values might arise without you noticing them in your
reproduction pipeline.
+One solution is to use @command{git describe} instead of the short hash alone.
+A less recommended solution is to add a space after the commit hash and Fits
will write the value as `@command{87095e5 }' in the header.
+If you later compare the strings on the shell, the space character will be
ignored by the shell in the latter solution and there will be no problem.}.
+Other than the @code{KEYWORD}, all the other values are optional.
+To leave a given token empty, follow the preceding comma (@key{,}) immediately
with the next.
+If any space character is present around the commas, it will be considered
part of the respective token.
+So if more than one token has space characters within it, the safest method to
specify a value to this option is to put double quotation marks around each
individual token that needs it.
+Note that without double quotation marks, space characters will be seen as
option separators and can lead to undefined behavior.
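+
+For example, the call below (a sketch with a hypothetical keyword) sets the value of @code{EXPTIME} to 900 and its unit to @code{s}, while giving no comment: note the two consecutive commas that leave the comment token empty.
+
+@example
+$ astfits image.fits -h1 --update=EXPTIME,900,,s
+@end example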
-The various standards and the file name extensions recognized by ConvertType
are listed below.
-For a review on the difference between Raster and Vector graphics, see
@ref{Raster and Vector graphics}.
-For a review on the concept of color and channels, see @ref{Color}.
-Currently, except for the FITS format, Gnuastro uses the file name's suffix to
identify the format, so if the file's name does not end with one of the
suffixes mentioned below, it will not be recognized.
+@item -w STR
+@itemx --write=STR
+Write a keyword to the header.
+For the possible value input formats, comments and units for the keyword, see
the @option{--update} option above.
+The special names (first string) below will cause a special behavior:
-@table @asis
-@item FITS or IMH
-@cindex IRAF
-@cindex Astronomical data format
-Astronomical data are commonly stored in the FITS format (or the older data
IRAF @file{.imh} format), a list of file name suffixes which indicate that the
file is in this format is given in @ref{Arguments}.
-FITS is a raster graphics format.
+@table @option
-Each image extension of a FITS file only has one value per pixel/element.
-Therefore, when used as input, each input FITS image contributes as one color
channel.
-If you want multiple extensions in one FITS file for different color channels,
you have to repeat the file name multiple times and use the @option{--hdu},
@option{--hdu2}, @option{--hdu3} or @option{--hdu4} options to specify the
different extensions.
+@item /
+Write a ``title'' to the list of keywords.
+A title consists of one blank line, followed by another line that is blank for several spaces and then starts with a slash (@key{/}).
+The second string given to this option is the ``title'' or string printed
after the slash.
+For example, with the command below you can add a ``title'' of `My keywords'
after the existing keywords and add the subsequent @code{K1} and @code{K2}
keywords under it (note that keyword names are not case sensitive).
-@item JPEG
-@cindex JPEG format
-@cindex Raster graphics
-@cindex Pixelated graphics
-The JPEG standard was created by the Joint photographic experts group.
-It is currently one of the most commonly used image formats.
-Its major advantage is the compression algorithm that is defined by the
standard.
-Like the FITS standard, this is a raster graphics format, which means that it
is pixelated.
+@example
+$ astfits test.fits -h1 --write=/,"My keywords" \
+ --write=k1,1.23,"My first keyword" \
+ --write=k2,4.56,"My second keyword"
+$ astfits test.fits -h1
+[[[ ... truncated ... ]]]
-A JPEG file can have 1 (for gray-scale), 3 (for RGB) and 4 (for CMYK) color
channels.
-If you only want to convert one JPEG image into other formats, there is no
problem, however, if you want to use it in combination with other input files,
make sure that the final number of color channels does not exceed four.
-If it does, then ConvertType will abort and notify you.
+ / My keywords
+K1 = 1.23 / My first keyword
+K2 = 4.56 / My second keyword
+END
+@end example
-@cindex Suffixes, JPEG images
-The file name endings that are recognized as a JPEG file for input are:
-@file{.jpg}, @file{.JPG}, @file{.jpeg}, @file{.JPEG}, @file{.jpe},
@file{.jif}, @file{.jfif} and @file{.jfi}.
+Adding a ``title'' before each contextually separate group of header keywords
greatly helps in readability and visual inspection of the keywords.
+So generally, when you want to add new FITS keywords, it is good practice to
also add a title before them.
-@item TIFF
-@cindex TIFF format
-TIFF (or Tagged Image File Format) was originally designed as a common format
for scanners in the early 90s and since then it has grown to become very
general.
-In many aspects, the TIFF standard is similar to the FITS image standard: it
can allow data of many types (see @ref{Numeric data types}), and also allows
multiple images to be stored in a single file (like a FITS extension: each
image in the file is called a `directory' in the TIFF standard).
-However, unlike FITS, it can only store images, it has no constructs for
tables.
-Also unlike FITS, each `directory' of a TIFF file can have a multi-channel
(e.g., RGB) image.
-Another (inconvenient) difference with the FITS standard is that keyword names
are stored as numbers, not human-readable text.
+The reason you need to use @key{/} as the keyword name for setting a title is that @key{/} is the first non-white character of a title line (as in the output above), which distinguishes it from a normal keyword record.
-However, outside of astronomy, because of its support of different numeric
data types, many fields use TIFF images for accurate (for example, 16-bit
integer or floating point for example) imaging data.
+The titles are written into the FITS header in the same order that @option{--write} is called.
+Therefore, in one run of the Fits program, you can specify many different titles (each with its own keywords under it).
+For example, the command below builds on the previous example and adds another group of keywords, named @code{A1} and @code{A2}.
-@item EPS
-@cindex EPS
-@cindex PostScript
-@cindex Vector graphics
-@cindex Encapsulated PostScript
-The Encapsulated PostScript (EPS) format is essentially a one page PostScript
file which has a specified size.
-Postscript is used to store a full document like this whole Gnuastro book.
-PostScript therefore also includes non-image data, for example, lines and
texts.
-It is a fully functional programming language to describe a document.
-A PostScript file is a plain text file that can be edited like any program
source with any plain-text editor.
-Therefore in ConvertType, EPS is only an output format and cannot be used as
input.
-Contrary to the FITS or JPEG formats, PostScript is not a raster format, but
is categorized as vector graphics.
+@example
+$ astfits test.fits -h1 --write=/,"My keywords" \
+ --write=k1,1.23,"My first keyword" \
+ --write=k2,4.56,"My second keyword" \
+ --write=/,"My second group of keywords" \
+ --write=a1,7.89,"First keyword" \
+ --write=a2,0.12,"Second keyword"
+@end example
-@cindex @TeX{}
-@cindex @LaTeX{}
-With these features in mind, you can see that when you are compiling a
document with @TeX{} or @LaTeX{}, using an EPS file is much more low level than
a JPEG and thus you have much greater control and therefore quality.
-Since it also includes vector graphic lines we also use such lines to make a
thin border around the image to make its appearance in the document much better.
-Furthermore, through EPS, you can add marks over the image in many shapes and
colors.
-No matter the resolution of the display or printer, these lines will always be
clear and not pixelated.
-However, this can be done better with tools within @TeX{} or @LaTeX{} such as
PGF/Tikz@footnote{@url{http://sourceforge.net/projects/pgf/}}.
+@item checksum
+@cindex CFITSIO
+@cindex @code{DATASUM}: FITS keyword
+@cindex @code{CHECKSUM}: FITS keyword
+When nothing is given afterwards, the header integrity keywords @code{DATASUM}
and @code{CHECKSUM} will be calculated and written/updated.
+The calculation and writing is done fully by CFITSIO, therefore they comply
with the FITS standard
4.0@footnote{@url{https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf}}
that defines these keywords (its Appendix J).
-@cindex Binary image
-@cindex Saving binary image
-@cindex Black and white image
-If the final input image (possibly after all operations on the flux explained
below) is a binary image or only has two colors of black and white (in
segmentation maps for example), then PostScript has another great advantage
compared to other formats.
-It allows for 1 bit pixels (pixels with a value of 0 or 1), this can decrease
the output file size by 8 times.
-So if a gray-scale image is binary, ConvertType will exploit this property in
the EPS and PDF (see below) outputs.
+If a value is given (e.g., @option{--write=checksum,MyOwnCheckSum}), then
CFITSIO will not be called to calculate these two keywords and the value (as
well as possible comment and unit) will be written just like any other keyword.
+This is generally not recommended since @code{CHECKSUM} is a reserved FITS
standard keyword.
+If you want to calculate the checksum with another hashing standard manually
and write it into the header, it is recommended to use another keyword name.
-@cindex Suffixes, EPS format
-The standard formats for an EPS file are @file{.eps}, @file{.EPS},
@file{.epsf} and @file{.epsi}.
-The EPS outputs of ConvertType have the @file{.eps} suffix.
+In the FITS standard, @code{CHECKSUM} depends on the HDU's data @emph{and} header keywords; it will therefore not be valid if you make any further changes to the header after writing the @code{CHECKSUM} keyword.
+This includes any further keyword modification options in the same call to the
Fits program.
+However, @code{DATASUM} only depends on the data section of the HDU/extension,
so it is not changed when you add, remove or update the header keywords.
+Therefore, it is recommended to write these keywords as the last keywords that
are written/modified in the extension.
+You can use the @option{--verify} option (described below) to verify the
values of these two keywords.
-@item PDF
-@cindex PDF
-@cindex Adobe systems
-@cindex PostScript vs. PDF
-@cindex Compiled PostScript
-@cindex Portable Document format
-@cindex Static document description format
-The Portable Document Format (PDF) is currently the most common format for
documents.
-It is a vector graphics format, allowing abstract constructs like marks or
borders.
+@item datasum
+Similar to @option{checksum}, but only write the @code{DATASUM} keyword (that
does not depend on the header keywords, only the data).
+@end table
-The PDF format is based on Postscript, so it shares all the features mentioned
above for EPS.
-To be able to display it is programmed content or print, a Postscript file
needs to pass through a processor or compiler.
-A PDF file can be thought of as the processed output of the PostScript
compiler.
-PostScript, EPS and PDF were created and are registered by Adobe Systems.
+@item -a STR
+@itemx --asis=STR
+Write the given @code{STR} @emph{exactly} as it is, into the given FITS file
header with no modifications.
+If the contents of @code{STR} do not conform to the FITS standard for keywords, then it may (most probably: it will!) corrupt your file and you may not be able to open it any more.
+So please be @strong{very careful} with this option (it is your responsibility to make sure that the string conforms with the FITS standard for keywords).
-@cindex Suffixes, PDF format
-@cindex GPL Ghostscript
-As explained under EPS above, a PDF document is a static document description
format, viewing its result is therefore much faster and more efficient than
PostScript.
-To create a PDF output, ConvertType will make an EPS file and convert that to
PDF using GPL Ghostscript.
-The suffixes recognized for a PDF file are: @file{.pdf}, @file{.PDF}.
-If GPL Ghostscript cannot be run on the PostScript file, The EPS will remain
and a warning will be printed (see @ref{Optional dependencies}).
+If you want to define a keyword from scratch, it is best to use the @option{--write} option (described above) and let CFITSIO worry about complying with the FITS standard.
+Also, if you want to copy keywords from one FITS file to another, you can use @option{--copykeys} (described below).
+Through these higher-level options, you do not have to worry about such low-level issues.
-@item @option{blank}
-@cindex @file{blank} color channel
-This is not actually a file type! But can be used to fill one color channel
with a blank value.
-If this argument is given for any color channel, that channel will not be used
in the output.
+One common usage of @option{--asis} occurs when you are given the contents of a FITS header (many keywords) as a plain-text file: the format of each keyword line conforms with the FITS standard, but the file itself is plain text, with one keyword per line when you open it in a plain-text editor.
+In that case, Gnuastro's Fits program will not be able to read the file directly (it does not conform to the FITS standard, which has no new-line characters!).
+With the command below, you can insert those headers in @file{headers.txt}
into @file{img.fits} (its HDU number 1, the default; you can change the HDU to
modify with @option{--hdu}).
-@item Plain text
-@cindex Plain text
-@cindex Suffixes, plain text
-The value of each pixel in a 2D image can be written as a 2D matrix in a
plain-text file.
-Therefore, for the purpose of ConvertType, plain-text files are a
single-channel raster graphics file format.
-
-Plain text files have the advantage that they can be viewed with any text
editor or on the command-line.
-Most programs also support input as plain text files.
-As input, each plain text file is considered to contain one color channel.
-
-In ConvertType, the recognized extensions for plain text files are @file{.txt}
and @file{.dat}.
-As described in @ref{Invoking astconvertt}, if you just give these extensions,
(and not a full filename) as output, then automatic output will be preformed to
determine the final output name (see @ref{Automatic output}).
-Besides these, when the format of a file cannot be recognized from its name,
ConvertType will fall back to plain text mode.
-So you can use any name (even without an extension) for a plain text input or
output.
-Just note that when the suffix is not recognized, automatic output will not be
preformed.
+@example
+$ cat headers.txt \
+ | while read line; do \
+ astfits img.fits --asis="$line"; \
+ done
+@end example
-The basic input/output on plain text images is very similar to how tables are
read/written as described in @ref{Gnuastro text table format}.
-Simply put, the restrictions are very loose, and there is a convention to
define a name, units, data type (see @ref{Numeric data types}), and comments
for the data in a commented line.
-The only difference is that as a table, a text file can contain many datasets
(columns), but as a 2D image, it can only contain one dataset.
-As a result, only one information comment line is necessary for a 2D image,
and instead of the starting `@code{# Column N}' (@code{N} is the column
number), the information line for a 2D image must start with `@code{# Image 1}'.
-When ConvertType is asked to output to plain text file, this information
comment line is written before the image pixel values.
+@cartouche
+@noindent
+@strong{Don't forget a title:} Since the newly added headers in the example
above weren't originally in the file, they are probably some form of high-level
metadata.
+The raw example above will just append the new keywords after the last existing one, making them hard for a human to read (it is not clear what this new group of keywords signifies, where it starts, and where it ends).
+To do that, run the following command before the @command{cat ...} command
above (replace @code{Imported keys} with any title that best describes this
group of new keywords based on their context):
+@example
+$ astfits img.fits --write=/,"Imported keys"
+@end example
+@end cartouche
-When converting an image to plain text, consider the fact that if the image is
large, the number of columns in each line will become very large, possibly
making it very hard to open in some text editors.
+@item -H STR
+@itemx --history STR
+Add a @code{HISTORY} keyword to the header with the given value.
+A new @code{HISTORY} keyword will be created for every instance of this option.
+If the string given to this option is longer than 70 characters, it will be separated into multiple keyword cards.
+If there is an error, Fits will give a warning and return with a non-zero value, but will not stop.
+To stop as soon as an error occurs, run with @option{--quitonerror}.
-@item Standard output (command-line)
-This is very similar to the plain text output, but instead of creating a file
to keep the printed values, they are printed on the command-line.
-This can be very useful when you want to redirect the results directly to
another program in one command with no intermediate file.
-The only difference is that only the pixel values are printed (with no
information comment line).
-To print to the standard output, set the output name to `@file{stdout}'.
+@item -c STR
+@itemx --comment STR
+Add a @code{COMMENT} keyword to the header with the given value.
+Similar to the explanation for @option{--history} above.
-@end table
+@item -t
+@itemx --date
+Put the current date and time in the header.
+If the @code{DATE} keyword already exists in the header, it will be updated.
+If there is a writing error, Fits will give a warning and return with a
non-zero value, but will not stop.
+To stop as soon as an error occurs, run with @option{--quitonerror}.
-@node Color, Annotations for figure in paper, Recognized file formats,
ConvertType
-@subsection Color
+@item -p
+@itemx --printallkeys
+Print the full metadata (keywords, values, units and comments) in the
specified FITS extension (HDU).
+If this option is called along with any of the other keyword editing commands described above, all the other editing commands take precedence over it.
+Therefore, it will print the final keywords after all the editing has been done.
-@cindex RGB
-@cindex Filter
-@cindex Color channel
-@cindex Channel (color)
-Color is generally defined after mixing various data ``channels''.
-The values for each channel usually come a filter that is placed in the
optical path.
-Filters, only allow a certain window of the spectrum to pass (for example, the
SDSS @emph{r} filter only allows light from about 5500 to 7000 Angstroms).
-In digital monitors or common digital cameras, a different set of filters are
used: Red, Green and Blue (commonly known as RGB) that are more optimized to
the eye's perception.
-On the other hand, when printing on paper, standard printers use the cyan,
magenta, yellow and key (CMYK, key=black) color space.
+@item --printkeynames
+Print only the keyword names of the specified FITS extension (HDU), one line
per name.
+This option must be called alone.
-@menu
-* Pixel colors:: Multiple filters in each pixel.
-* Colormaps for single-channel pixels:: Better display of single-filter
images.
-* Vector graphics colors::
-@end menu
+@item -v
+@itemx --verify
+Verify the @code{DATASUM} and @code{CHECKSUM} data integrity keywords of the
FITS standard.
+See the description under the @code{checksum} (under @option{--write}, above)
for more on these keywords.
+This option will print @code{Verified} for both keywords if they can be
verified.
+Otherwise, if they do not exist in the given HDU/extension, it will print
@code{NOT-PRESENT}, and if they cannot be verified it will print
@code{INCORRECT}.
+In the latter case (when the keyword values exist but cannot be verified), the
Fits program will also return with a failure.
-@node Pixel colors, Colormaps for single-channel pixels, Color, Color
-@subsubsection Pixel colors
-@cindex RGB
-@cindex CMYK
-@cindex Image
-@cindex Color
-@cindex Pixels
-@cindex Colormap
-@cindex Primary colors
+By default this option will also print a short description of the @code{DATASUM} and @code{CHECKSUM} keywords.
+You can suppress this extra information with the @option{--quiet} option.
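+
+For example, you can first write the integrity keywords into an HDU and then verify them like this (a sketch with a hypothetical file name):
+
+@example
+$ astfits image.fits -h1 --write=checksum
+$ astfits image.fits -h1 --verify --quiet
+@end example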
-@cindex Color channel
-@cindex Channel, color
-As discussed in @ref{Color}, for each displayed/printed pixel of a color
image, the dataset/image has three or four values.
-To store/show the three values for each pixel, cameras and monitors allocate a
certain fraction of each pixel's area to red, green and blue filters.
-These three filters are thus built into the hardware at the pixel level.
+@item --copykeys=INT:INT/STR,STR[,STR]
+Copy the desired set of the input's keyword records to the output (specified with @option{--output} and @option{--outhdu} for the file name and HDU/extension respectively).
+The keywords to copy can be given either as a range (in the format of
@code{INT:INT}, inclusive) or a list of keyword names as comma-separated
strings (@code{STR,STR}), the list can have any number of keyword names.
+More details and examples of the two forms are given below:
-However, because measurement accuracy is very important in scientific
instruments, and we want to do measurements (take images) with various/custom
filters (without having to order a new expensive detector!), scientific
detectors use the full area of the pixel to store one value for it in a
single/mono channel dataset.
-To make measurements in different filters, we just place a filter in the light
path before the detector.
-Therefore, the FITS format that is used to store astronomical datasets is
inherently a mono-channel format (see @ref{Recognized file formats} or
@ref{Fits}).
+@table @asis
+@item Range
+The given string to this option must be two integers separated by a colon
(@key{:}).
+The first integer must be positive (counting of the keyword records starts
from 1).
+The second integer may be negative (zero is not acceptable) or an integer
larger than the first.
-@cindex False color
-@cindex Pseudo color
-When a subject has been imaged in multiple filters, you can feed each
different filter into the red, green and blue channels of your monitor and
obtain a false-colored visualization.
-The reason we say ``false-color'' (or pseudo color) is that generally, the
three data channels you provide are not from the same Red, Green and Blue
filters of your monitor!
-So the observed color on your monitor does not correspond the physical
``color'' that you would have seen if you looked at the object by eye.
-Nevertheless, it is good (and sometimes necessary) for visualization (of
special features).
+A negative second integer means counting from the end.
+So @code{-1} is the last copy-able keyword (not including the @code{END}
keyword).
-In ConvertType, you can do this by giving each separate single-channel dataset
(for example, in the FITS image format) as an argument (in the proper order),
then asking for the output in a format that supports multi-channel datasets
(for example, see the command below, or @ref{ConvertType input and output}).
+To see the header keywords of the input with a number before them, you can
pipe the output of the Fits program (when it prints all the keywords in an
extension) into the @command{cat} program like below:
@example
-$ astconvertt r.fits g.fits b.fits --output=color.jpg
+$ astfits input.fits -h1 | cat -n
@end example
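+
+For example, after finding the desired record numbers with the command above, a range of records can be copied like this (a sketch with hypothetical names, copying records 12 to 20 inclusive):
+
+@example
+$ astfits input.fits -h1 --copykeys=12:20 \
+          --output=output.fits --outhdu=1
+@end example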
+@item List of names
+The given string to this option must be a comma separated list of keyword
names.
+For example, see the command below:
-@node Colormaps for single-channel pixels, Vector graphics colors, Pixel
colors, Color
-@subsubsection Colormaps for single-channel pixels
-
-@cindex Visualization
-@cindex Colormap, HSV
-@cindex HSV: Hue Saturation Value
-As discussed in @ref{Pixel colors}, color is not defined when a dataset/image
contains a single value for each pixel.
-However, we interact with scientific datasets through monitors or printers.
-They allow multiple channels (independent values) per pixel and produce color
with them (on monitors, this is usually with three channels: Red, Green and
Blue).
-As a result, there is a lot of freedom in visualizing a single-channel dataset.
-
-The mapping of single-channel values to multi-channel colors is called called
a ``color map''.
-Since more information can be put in multiple channels, this usually results
in better visualizing the dynamic range of your single-channel data.
-In ConvertType, you can use the @option{--colormap} option to choose between
different mappings of mono-channel inputs, see @ref{Invoking astconvertt}.
-Below, we will review two of the basic color maps, please see the description
of @option{--colormap} in @ref{Invoking astconvertt} for the full list.
+@example
+$ astfits input.fits -h1 --copykeys=KEY1,KEY2 \
+ --output=output.fits --outhdu=1
+@end example
+Please consider the notes below when copying keywords with names:
@itemize
@item
-@cindex Grayscale
-@cindex Colormap, gray-scale
-The most basic colormap is shades of black (because of its strong contrast
with white).
-This scheme is called @url{https://en.wikipedia.org/wiki/Grayscale, Grayscale}.
-But ultimately, the black is just one color, so with Grayscale, you are not
using the full dynamic range of the three-channel monitor effectively.
-To help in visualization, more complex mappings can be defined.
-
+If the number of characters in the name is more than 8, CFITSIO will place a
@code{HIERARCH} before it.
+In this case simply give the name and do not give the @code{HIERARCH} (which
is a constant and not considered part of the keyword name).
@item
-A slightly more complex color map can be defined when you scale the values to
a range of 0 to 360, and use as it as the ``Hue'' term of the
@url{https://en.wikipedia.org/wiki/HSL_and_HSV, Hue-Saturation-Value} (HSV)
color space (while fixing the ``Saturation'' and ``Value'' terms).
-The increased usage of the monitor's 3-channel color space is indeed better,
but the resulting images can be un-''natural'' to the eye.
+If your keyword name is composed only of digits, do not give it as the first name given to @option{--copykeys}.
+Otherwise, it will be confused with the range format above.
+You can safely give an only-digit keyword name as the second or later requested keyword.
+@item
+If the keyword is repeated more than once in the header, currently only the
first instance will be copied.
+In other words, even if you call @option{--copykeys} multiple times with the
same keyword name, its first instance will be copied.
+If you need to copy multiple instances of the same keyword, please get in
touch with us at @code{bug-gnuastro@@gnu.org}.
@end itemize
-Since grayscale is a commonly used mapping of single-valued datasets, we will
continue with a closer look at how it is stored.
-One way to represent a gray-scale image in different color spaces is to use
the same proportions of the primary colors in each pixel.
-This is the common way most FITS image viewers work: for each pixel, they fill
all the channels with the single value.
-While this is necessary for displaying a dataset, there are downsides when
storing/saving this type of grayscale visualization (for example, in a paper).
-
-@itemize
+@end table
-@item
-Three (for RGB) or four (for CMYK) values have to be stored for every pixel,
this makes the output file very heavy (in terms of bytes).
+@item --outhdu
+The HDU/extension to write the output keywords of @option{--copykeys}.
-@item
-If printing, the printing errors of each color channel can make the printed
image slightly more blurred than it actually is.
+@item -Q
+@itemx --quitonerror
+Quit if any of the operations above are not successful.
+By default, if an error occurs, Fits will warn the user about the faulty
keyword and continue with the rest of the actions.
-@end itemize
+@item -s STR
+@itemx --datetosec STR
+@cindex Unix epoch time
+@cindex Time, Unix epoch
+@cindex Epoch, Unix time
+Interpret the value of the given keyword in the FITS date format (most
generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}) and return the corresponding Unix
epoch time (number of seconds that have passed since 00:00:00 Thursday, January
1st, 1970).
+The @code{Thh:mm:ss.ddd...} section (specifying the time of day), and also the
@code{.ddd...} (specifying the fraction of a second) are optional.
+The value to this option must be the FITS keyword name that contains the
requested date, for example, @option{--datetosec=DATE-OBS}.
-@cindex PNG standard
-@cindex Single channel CMYK
-To solve both these problems when storing grayscale visualization, the best
way is to save a single-channel dataset into the black channel of the CMYK
color space.
-The JPEG standard is the only common standard that accepts CMYK color space.
+@cindex GNU C Library
+This option can also interpret the older FITS date format
(@code{DD/MM/YYThh:mm:ss.ddd...}) where only two characters are given to the
year.
+In this case (following the GNU C Library), this option will make the
following assumption: values 69 to 99 correspond to the years 1969 to 1999, and
values 0 to 68 correspond to the years 2000 to 2068.
-The JPEG and EPS standards set two sizes for the number of bits in each
channel: 8-bit and 12-bit.
-The former is by far the most common and is what is used in ConvertType.
-Therefore, each channel should have values between 0 to @math{2^8-1=255}.
-From this we see how each pixel in a gray-scale image is one byte (8 bits)
long, in an RGB image, it is 3 bytes long and in CMYK it is 4 bytes long.
-But thanks to the JPEG compression algorithms, when all the pixels of one
channel have the same value, that channel is compressed to one pixel.
-Therefore a Grayscale image and a CMYK image that has only the K-channel
filled are approximately the same file size.
+This is a very useful option for operations on the FITS date values, for
example, sorting FITS files by their dates, or finding the time difference
between two FITS files.
+The advantage of working with the Unix epoch time is that you do not have to
worry about calendar details (for example, the number of days in different
months, or leap years).
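+For example, the commands below are a minimal sketch of computing the time
difference (in seconds) between two hypothetical files @file{a.fits} and
@file{b.fits} (assuming both have a @code{DATE-OBS} keyword in HDU 1, and that
@option{--quiet} leaves only the final number on standard output):
+@example
+$ t1=$(astfits a.fits -h1 --datetosec=DATE-OBS --quiet)
+$ t2=$(astfits b.fits -h1 --datetosec=DATE-OBS --quiet)
+$ echo $((t2 - t1))
+@end example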
-@node Vector graphics colors, , Colormaps for single-channel pixels, Color
-@subsubsection Vector graphics colors
-@cindex Web colors
-@cindex Colors (web)
-When creating vector graphics, ConvertType recognizes the
@url{https://en.wikipedia.org/wiki/Web_colors#Extended_colors, extended web
colors} that are the result of merging the colors in the HTML 4.01, CSS 2.0,
SVG 1.0 and CSS3 standards.
-They are all shown with their standard name in @ref{colornames}.
-The names are not case sensitive so you can use them in any form (for example,
@code{turquoise} is the same as @code{Turquoise} or @code{TURQUOISE}).
+@item --wcscoordsys=STR
+@cindex Galactic coordinate system
+@cindex Ecliptic coordinate system
+@cindex Equatorial coordinate system
+@cindex Supergalactic coordinate system
+@cindex Coordinate system: Galactic
+@cindex Coordinate system: Ecliptic
+@cindex Coordinate system: Equatorial
+@cindex Coordinate system: Supergalactic
+Convert the coordinate system of the image's world coordinate system (WCS) to
the given coordinate system (@code{STR}) and write it into the file given to
@option{--output} (or an automatically named file if no @option{--output} has
been given).
-@cindex 24-bit terminal
-@cindex True color terminal
-@cindex Terminal (true color, 24 bit)
-On the command-line, you can also get the list of colors with the
@option{--listcolors} option to ConvertType, like below.
-In particular, if your terminal is 24-bit or "true color", in the last column,
you will see each color.
-This greatly helps in selecting the best color for our purpose easily on the
command-line (without taking your hands off the keyboard and getting
distracted).
+For example, with the command below, @file{img-eq.fits} will have an identical
dataset (pixel values) to @file{image.fits}.
+However, the WCS coordinate system of @file{img-eq.fits} will be the
equatorial coordinate system in the Julian calendar epoch 2000 (which is the
most common epoch used today).
+Fits will automatically extract the current coordinate system of
@file{image.fits} and as long as it is one of the recognized coordinate systems
listed below, it will do the conversion.
@example
-$ astconvertt --listcolors
+$ astfits image.fits --wcscoordsys=eq-j2000 \
+ --output=img-eq.fits
@end example
-@float Figure,colornames
-@center@image{gnuastro-figures/color-names, 15cm, , }
-
-@caption{Recognized color names in Gnuastro, shown with their numerical
identifiers.}
-@end float
-
-@node Annotations for figure in paper, Invoking astconvertt, Color, ConvertType
-@subsection Annotations for figure in paper
+The currently recognized coordinate systems are listed below (the most common
one today is @code{eq-j2000}):
-@cindex Image annotation
-@cindex Annotation of images for paper
-To make a nice figure from your FITS images, it is important to show more than
merely the raw image (converted to a printer friendly format like PDF or JPEG).
-Annotations (or visual metadata) over the raw image greatly help the readers
clearly see your argument and put the image/result in a larger context.
-Examples include:
-@itemize
-@item
-Coordinates (Right Ascension and Declination) on the edges of the image, so
viewers of your paper or presentation slides can get a physical feeling of the
field's sky coverage.
-@item
-Thick line that has a fixed tangential size (for example, in kilo parsecs) at
the redshift/distance of interest.
-@item
-Contours over the image to show radio/X-ray emission, over an optical image
for example.
-@item
-Text, arrows, etc., over certain parts of the image.
-@end itemize
+@table @code
+@item eq-j2000
+2000.0 (Julian-year) equatorial coordinates.
+@item eq-b1950
+1950.0 (Besselian-year) equatorial coordinates.
+@item ec-j2000
+2000.0 (Julian-year) ecliptic coordinates.
+@item ec-b1950
+1950.0 (Besselian-year) ecliptic coordinates.
+@item galactic
+Galactic coordinates.
+@item supergalactic
+Supergalactic coordinates.
+@end table
-@cindex PGFPlots
-Because of the modular philosophy of Gnuastro, ConvertType is only focused on
converting your FITS images to printer friendly formats like JPEG or PDF.
-But to present your results in a slide or paper, you will often need to
annotate the raw JPEG or PDF with some of the features above.
-The good news is that there are many powerful plotting programs that you can
use to add such annotations.
-As a result, there is no point in making a new one, specific to Gnuastro.
-In this section, we will demonstrate this using the very powerful
PGFPlots@footnote{@url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}
package of @LaTeX{}.
+The Equatorial and Ecliptic coordinate systems are defined by the mean equator
and equinox epoch: either the Besselian year 1950.0, or the Julian year 2000.
+For more on their difference and links for further reading about epochs in
astronomy, please see the description in
@url{https://en.wikipedia.org/wiki/Epoch_(astronomy), Wikipedia}.
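+For example, the command below is a minimal sketch of converting the same
image to the Galactic coordinate system (one of the recognized values in the
table above):
+@example
+$ astfits image.fits --wcscoordsys=galactic \
+          --output=img-gal.fits
+@end example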
-@cartouche
-@noindent
-@strong{Single script for easy running:} In this section we are reviewing the
reason and details of every step which is good for educational purposes.
-But when you know the steps already, these separate code blocks can be
annoying.
-Therefore the full script (except for the data download step) is available in
@ref{Full script of annotations on figure}.
-@end cartouche
+@item --wcsdistortion=STR
+@cindex WCS distortion
+@cindex Distortion, WCS
+@cindex SIP WCS distortion
+@cindex TPV WCS distortion
+If the input file has a WCS distortion, the output (file given with the
@option{--output} option) will have the distortion given to this option (for
example, @code{SIP} or @code{TPV}).
+The output will be a new file (with a copy of the image, and the new WCS), so
if it already exists, the file will be deleted (unless you use the
@option{--dontdelete} option, see @ref{Input output options}).
-@cindex TiKZ
-@cindex Matplotlib
-PGFPlots uses the same @LaTeX{} graphic engine that typesets your paper/slide.
-Therefore when you build your plots and figures using PGFPlots (and its
underlying package
PGF/TiKZ@footnote{@url{http://mirrors.ctan.org/graphics/pgf/base/doc/pgfmanual.pdf}})
your plots will blend beautifully within your text: same fonts, same colors,
same line properties, etc.
-Since most papers (and presentation slides@footnote{To build slides, @LaTeX{}
has packages like Beamer, see
@url{http://mirrors.ctan.org/macros/latex/contrib/beamer/doc/beameruserguide.pdf}})
are made with @LaTeX{}, PGFPlots is therefore the best tool for those who use
@LaTeX{} to create documents.
-PGFPlots also does not need any extra dependencies beyond a basic/minimal
@TeX{}-live installation, so it is much more reliable than tools like
Matplotlib in Python that have hundreds of fast-evolving
dependencies@footnote{See Figure 1 of Alliez et al. 2020 at
@url{https://arxiv.org/pdf/1905.11123.pdf}}.
+With this option, the Fits program will read the minimal set of keywords from
the input HDU and the HDU data.
+It will then write them into the file given to the @option{--output} option
but with a newly created set of WCS-related keywords corresponding to the
desired distortion standard.
-To demonstrate this, we will create a surface brightness image of a galaxy in
the F160W filter of the ABYSS
survey@footnote{@url{http://research.iac.es/proyecto/abyss}}.
-In the code-block below, let's make a ``build'' directory to keep intermediate
files and avoid populating the source.
-Afterwards, we will download the full image and crop out a 20 arcmin wide
image around the galaxy with the commands below.
-You can run these commands in an empty directory.
+If no @option{--output} file is specified, an automatically generated output
name will be used, composed of the input's name but with the @file{-DDD.fits}
suffix (see @ref{Automatic output}), where @file{DDD} is the value given to
this option (the desired output distortion).
-@example
-$ mkdir build
-$ wget http://cdsarc.u-strasbg.fr/ftp/J/A+A/621/A133/fits/ah_f160w.fits
-$ astcrop ah_f160w.fits --center=53.1616278,-27.7802446 --mode=wcs \
- --width=20/3600 --output=build/crop.fits
-@end example
+Note that not all possible conversions between all standards are supported yet.
+If the requested conversion is not supported, an informative error message
will be printed.
+If this happens, please let us know and we will try our best to add the
respective conversions.
-To better show the low surface brightness (LSB) outskirts, we will warp the
image, then convert the pixel units to surface brightness with the commands
below.
-It is very important that the warping is done @emph{before} the conversion to
surface brightness (in units of mag/arcsec@mymath{^2}), because the definition
of surface brightness is non-linear.
-For more, see the surface brightness topic of @ref{Brightness flux magnitude},
and for a more complete tutorial, see @ref{FITS images in a publication}.
+For example, with the command below, you can be sure that if @file{in.fits}
has a distortion in its WCS, the distortion of @file{out.fits} will be in the
SIP standard.
@example
-$ zeropoint=25.94
-$ astwarp build/crop.fits --centeroncorner --scale=1/3 \
- --output=build/scaled.fits
-$ pixarea=$(astfits build/scaled.fits --pixelareaarcsec2)
-$ astarithmetic build/scaled.fits $zeropoint $pixarea counts-to-sb \
- --output=build/sb.fits
+$ astfits in.fits --wcsdistortion=SIP --output=out.fits
@end example
+@end table
-We are now ready to convert the surface brightness image into a PDF.
-To better show the LSB features, we will also limit the color range with the
@code{--fluxlow} and @option{--fluxhigh} options: all pixels with a surface
brightness brighter than 22 mag/arcsec@mymath{^2} will be shown as black, and
all pixels with a surface brightness fainter than 30 mag/arcsec@mymath{^2} will
be white.
-These thresholds are being defined as variables, because we will also need
them below (to pass into PGFPlots).
-We will also set @option{--borderwidth=0}, because the coordinate system we
will add over the image will effectively be a border for the image (separating
it from the background).
-@example
-$ sblow=22
-$ sbhigh=30
-$ astconvertt build/sb.fits --colormap=gray --borderwidth=0 \
- --fluxhigh=$sbhigh --fluxlow=$sblow --output=build/sb.pdf
-@end example
+@node Pixel information images, , Keyword inspection and manipulation,
Invoking astfits
+@subsubsection Pixel information images
+In @ref{Keyword inspection and manipulation} options like
@option{--pixelscale} were introduced for information on the pixels from the
keywords.
+But that only provides a single value for all the pixels!
+This will not be sufficient in some scenarios; for example due to distortion,
different regions of the image will have different pixel areas when projected
onto the sky.
-Please open @file{sb.pdf} and have a look.
-Also, please open @file{sb.fits} in DS9 (or any other FITS viewer) and play
with the color range.
-Can the surface brightness limits be changed to better show the LSB structure?
-If so, you are free to change the limits above.
+@cindex Meta image
+The options in this section provide such ``meta'' images: images where the
pixel values are information about the pixel itself.
+Such images can be useful in understanding the underlying pixel grid with the
same tools that you use to study the astronomical objects within the image
(like @ref{SAO DS9}).
+After all, nothing beats visual inspection with tools you are familiar with.
-We now have the printable PDF representation of the image, but as discussed
above, it is not enough for a paper.
-We will add 1) a thick line showing the size of 20 kpc (kilo parsecs) at the
redshift of the central galaxy, 2) coordinates and 3) a color bar, showing the
surface brightness level of each grayscale level.
+@table @code
+@item --pixelareaonwcs
+Create a meta-image where each pixel's value shows its area in the WCS units
(usually degrees squared).
+The output is therefore the same size as the input.
-To get the first job done, we first need to know the redshift of the central
galaxy.
-To do this, we can use Gnuastro's Query program to look into all the objects
in NED within this image (only asking for the RA, Dec and redshift columns).
-We will then use the Match program to find the NED entry that corresponds to
our galaxy.
+@cindex Pixel mixing
+@cindex Area resampling
+@cindex Resampling by area
+This option uses the same ``pixel mixing'' or ``area resampling'' concept that
is described in @ref{Resampling} (as part of the Warp program).
+Similar to Warp, its sampling can be tuned with the @option{--edgesampling}
option that is described below.
-@example
-$ astquery ned --dataset=objdir --overlapwith=build/sb.fits \
- --column=ra,dec,z --output=ned.fits
-$ astmatch ned.fits -h1 --coord=53.1616278,-27.7802446 \
- --ccol1=RA,Dec --aperture=1/3600
-$ redshift=$(asttable ned_matched.fits -cz)
-$ echo $redshift
-@end example
+@cindex Distortion
+@cindex Area of pixel on sky
+One scenario where this option becomes handy is when you are debugging aligned
images using the Warp program (see @ref{Warp}).
+You may observe gradients after warping and can check whether they are caused
by the distortion of the instrument or not.
+Such gradients can happen because, due to distortion (or the type of
projection), the detector's pixels measure photons from different areas on
the sky.
+This effect is more pronounced in images covering larger portions of the sky,
for instance, the TESS
images@footnote{@url{https://www.nasa.gov/tess-transiting-exoplanet-survey-satellite}}.
-Now that we know the redshift of the central object, we can define the
coordinates of the thick line that will show the length of 20 kpc at that
redshift.
-It will be a horizontal line (fixed Declination) across a range of RA.
-The start of this thick line will be located at the top edge of the image (at
the 95-percent of the width and height of the image).
-With the commands below we will find the three necessary parameters (one
declination and two RAs).
-Just note that in astronomical images, RA increases to the left/east, which is
the reason we are using the minimum and @code{+} to find the RA starting point.
+Here is an example usage of the @option{--pixelareaonwcs} option:
@example
-$ scalelineinkpc=20
-$ coverage=$(astfits build/sb.fits --skycoverage --quiet | awk 'NR==2')
-$ scalelinedec=$(echo $coverage | awk '@{print $4-($4-$3)*0.05@}')
-$ scalelinerastart=$(echo $coverage | awk '@{print $1+($2-$1)*0.05@}')
-$ scalelineraend=$(astcosmiccal --redshift=$redshift --arcsectandist \
- | awk '@{start='$scalelinerastart'; \
- width='$scalelineinkpc'/$1/3600; \
- print start+width@}')
-@end example
+# Check the area each 'input.fits' pixel takes in sky
+$ astfits input.fits -h1 --pixelareaonwcs -o pixarea.fits
-To draw coordinates over the image, we need to feed these values into PGFPlots.
-But manually entering numbers into the PGFPlots source will be very
frustrating and prone to many errors!
-Fortunately there is an easy way to do this: @LaTeX{} macros.
-New macros are defined by this @LaTeX{} command:
-@example
-\newcommand@{\macroname@}@{value@}
-@end example
-@noindent
-Anywhere that @LaTeX{} confronts @code{\macroname}, it will replace
@code{value} when building the output.
-We will have one file called @file{macros.tex} in the build directory and
define macros based on those values.
-We will use the shell's @code{printf} command to write these macro definition
lines into the macro file.
-We just have to use double backslashes in the @code{printf} command, because
backslash is a meaningful character for @code{printf}, but we want to keep one
of them.
-Also, we put a @code{\n} at the end of each line, otherwise, all the commands
will go into a single line of the macro file.
-We will also place the random `@code{ma}' string at the start of all our
@LaTeX{} macros to help identify the macros for this plot.
+# Convert each pixel's area to arcsec^2
+$ astarithmetic pixarea.fits 3600 3600 x x \
+ --output=pixarea_arcsec2.fits
-@example
-$ macros=build/macros.tex
-$ printf '\\newcommand@{\\maScaleDec@}'"@{$scalelinedec@}\n" > $macros
-$ printf '\\newcommand@{\\maScaleRAa@}'"@{$scalelinerastart@}\n" >> $macros
-$ printf '\\newcommand@{\\maScaleRAb@}'"@{$scalelineraend@}\n" >> $macros
-$ printf '\\newcommand@{\\maScaleKpc@}'"@{$scalelineinkpc@}\n" >> $macros
-$ printf '\\newcommand@{\\maCenterZ@}'"@{$redshift@}\n" >> $macros
+# Compare area relative to the actual reported pixel scale
+$ pixarea=$(astfits input.fits --pixelscale -q \
+ | awk '@{print $3@}')
+$ astarithmetic pixarea.fits $pixarea / -o pixarea_rel.fits
@end example
-Please open the macros file after these commands and have a look to see if
they do conform to the expected format above.
-Another set of macros we will need to feed into PGFPlots is the coordinates of
the image corners.
-Fortunately the @code{coverage} variable found above is also useful here.
-We just need to extract each item before feeding it into the macros.
-To do this, we will use AWK and keep each value with the temporary shell
variable `@code{v}'.
+@item --edgesampling=INT
+Extra sampling along the pixel edges for @option{--pixelareaonwcs}.
+The default value is 0, meaning that only the pixel vertices are used.
+Values greater than zero improve the accuracy at the expense of greater time
and memory consumption.
+With that said, the default value of zero usually has a good precision unless
the given image has extreme distortions that produce irregular pixel shapes.
+For more, see @ref{Align pixels with WCS considering distortions}.
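+For example, the command below is a minimal sketch that adds two extra
sampling points along each pixel edge when calculating each pixel's area:
+@example
+$ astfits input.fits -h1 --pixelareaonwcs \
+          --edgesampling=2 --output=pixarea.fits
+@end example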
-@example
-$ v=$(echo $coverage | awk '@{print $1@}')
-$ printf '\\newcommand@{\\maCropRAMin@}'"@{$v@}\n" >> $macros
-$ v=$(echo $coverage | awk '@{print $2@}')
-$ printf '\\newcommand@{\\maCropRAMax@}'"@{$v@}\n" >> $macros
-$ v=$(echo $coverage | awk '@{print $3@}')
-$ printf '\\newcommand@{\\maCropDecMin@}'"@{$v@}\n" >> $macros
-$ v=$(echo $coverage | awk '@{print $4@}')
-$ printf '\\newcommand@{\\maCropDecMax@}'"@{$v@}\n" >> $macros
-@end example
+@cartouche
+@noindent
+@strong{Caution:} This option does not ``oversample'' the output image!
+Rather, it makes Warp use more points to calculate the @emph{input} pixel area.
+To oversample the output image, set a reasonable @option{--cdelt} value.
+@end cartouche
-Finally, we also need to pass some other numbers to PGFPlots: 1) the major
tick distance (in the coordinate axes that will be printed on the edge of the
image).
-We will assume 7 ticks for this image.
-2) The minimum and maximum surface brightness values that we gave to
ConvertType when making the PDF; PGFPlots will define its color-bar based on
these two values.
+@end table
-@example
-$ v=$(echo $coverage | awk '@{print ($2-$1)/7@}')
-$ printf '\\newcommand@{\\maTickDist@}'"@{$v@}\n" >> $macros
-$ printf '\\newcommand@{\\maSBlow@}'"@{$sblow@}\n" >> $macros
-$ printf '\\newcommand@{\\maSBhigh@}'"@{$sbhigh@}\n" >> $macros
-@end example
-All the necessary numbers are now ready.
-Please copy the contents below into a file called @file{my-figure.tex}.
-This is the PGFPlots source for this particular plot.
-Besides the coordinates and scale-line, we will also add some text over the
image and an orange arrow pointing to the central object with its redshift
printed over it.
-The parameters are generally human-readable, so you should be able to get a
good feeling of every line.
-There are also comments which will show up as a different color when you copy
this into a plain-text editor.
-@verbatim
-\begin{tikzpicture}
- %% Define the coordinates and colorbar
- \begin{axis}[
- at={(0,0)},
- axis on top,
- x dir=reverse,
- scale only axis,
- width=\linewidth,
- height=\linewidth,
- minor tick num=10,
- xmin=\maCropRAMin,
- xmax=\maCropRAMax,
- ymin=\maCropDecMin,
- ymax=\maCropDecMax,
- enlargelimits=false,
- every tick/.style={black},
- xtick distance=\maTickDist,
- ytick distance=\maTickDist,
- yticklabel style={rotate=90},
- ylabel={Declination (degrees)},
- xlabel={Right Ascension (degrees)},
- ticklabel style={font=\small,
- /pgf/number format/.cd, precision=4,/tikz/.cd},
- x label style={at={(axis description cs:0.5,0.02)},
- anchor=north,font=\small},
- y label style={at={(axis description cs:0.07,0.5)},
- anchor=south,font=\small},
- colorbar,
- colormap name=gray,
- point meta min=\maSBlow,
- point meta max=\maSBhigh,
- colorbar style={
- at={(1.01,1)},
- ylabel={Surface brightness (mag/arcsec$^2$)},
- yticklabel style={
- /pgf/number format/.cd, precision=1, /tikz/.cd},
- y label style={at={(axis description cs:5.3,0.5)},
- anchor=south,font=\small},
- },
- ]
- %% Put the image in the proper positions of the plot.
- \addplot graphics[ xmin=\maCropRAMin, xmax=\maCropRAMax,
- ymin=\maCropDecMin, ymax=\maCropDecMax]
- {sb.pdf};
-
- %% Draw the scale factor.
- \addplot[black, line width=5, name=scaleline] coordinates
- {(\maScaleRAa,\maScaleDec) (\maScaleRAb,\maScaleDec)}
- node [anchor=north west] {\large $\maScaleKpc$ kpc};
- \end{axis}
-
- %% Add some text anywhere over the plot. The text is added two
- %% times: the first time with a white background (that with a
- %% certain opacity), the second time just the text with opacity.
- \node[anchor=south west, fill=white, opacity=0.5]
- at (0.01\linewidth,0.01\linewidth)
- {(a) Text can be added here};
- \node[anchor=south west]
- at (0.01\linewidth,0.01\linewidth)
- {(a) Text can be added here};
- %% Add an arrow to highlight certain structures.
- \draw [->, red!70!yellow, line width=5]
- (0.35\linewidth,0.35\linewidth)
- -- node [anchor=south, rotate=45]{$z=\maCenterZ$}
- (0.45\linewidth,0.45\linewidth);
-\end{tikzpicture}
-@end verbatim
-Finally, we need another simple @LaTeX{} source for the main PDF ``report''
that will host this figure.
-This can actually be your paper or slides for example.
-Here, a minimal working example will suffice.
-@verbatim
-\documentclass{article}
-%% Import the TiKZ package and activate its "external" feature.
-\usepackage{tikz}
-\usetikzlibrary{external}
-\tikzexternalize
-%% PGFPlots (which uses TiKZ).
-\usepackage{pgfplots}
-\pgfplotsset{axis line style={thick}}
-\pgfplotsset{
- /pgfplots/colormap={gray}{rgb255=(0,0,0) rgb255=(255,255,255)}
-}
-%% Import the macros.
-\input{macros.tex}
-%% Start document.
-\begin{document}
-You can write anything here.
-%% Add the figure and its caption.
-\begin{figure}
- \input{my-figure.tex}
- \caption{A demo image.}
-\end{figure}
-%% Finish the document.
-\end{document}
-@end verbatim
-You are now ready to create the PDF.
-But @LaTeX{} creates many temporary files, so to avoid populating our
top-level directory, we will copy the two @file{.tex} files into the build
directory, go there and run @LaTeX{}.
-Before running it though, we will first delete all the files that have the
name pattern @file{*-figure0*}, these are ``external'' files created by
TiKZ+PGFPlots, including the actual PDF of the figure.
-@example
-$ cp report.tex my-figure.tex build
-$ cd build
-$ rm -f *-figure0*
-$ pdflatex -shell-escape -halt-on-error report.tex
-@end example
-You now have the full ``report'' in @file{report.pdf}.
-Try adding some extra text on top of the figure, or in the caption and
re-running the last four commands.
-Also try changing the 20kpc scale line length to 50kpc, or try changing the
redshift, to see how the length and text of the thick scale-line will
automatically change.
-But the good news is that you also have the raw PDF of the figure that you can
use in other places.
-You can see that file in @file{report-figure0.pdf}.
-In a larger paper, you can add multiple such figures (with different
@file{.tex} files that are placed in different @code{figure} environments with
different captions).
-Each figure will get a number in the build directory.
-TiKZ also allows setting a file name for each ``external'' figure (to avoid
such numbers that can be annoying if the image orders are changed).
-PGFPlots is also highly customizable, you can make a lot of changes and
customizations.
-Both
TiKZ@footnote{@url{http://mirrors.ctan.org/graphics/pgf/base/doc/pgfmanual.pdf}}
and
PGFPlots@footnote{@url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}
have wonderful manuals, so have a look through them.
-@menu
-* Full script of annotations on figure:: All the steps in one script
-@end menu
-@node Full script of annotations on figure, , Annotations for figure in
paper, Annotations for figure in paper
-@subsubsection Full script of annotations on figure
-In @ref{Annotations for figure in paper}, each one of the steps to add
annotations over an image was described in detail.
-So if you have understood the steps, but need to add annotations over an
image, repeating those steps individually will be annoying.
-Therefore in this section, we will summarize all the steps in a single script
that you can simply copy-paste into a text editor, configure, and run.
+@node ConvertType, Table, Fits, Data containers
+@section ConvertType
-@cartouche
-@noindent
-@strong{Necessary files:} To run this script, you will need an image to crop
your object from (here assuming it is called @file{ah_f160w.fits} with a
certain zero point) and two @file{my-figure.tex} and @file{report.tex} files
that were fully included in @ref{Annotations for figure in paper}.
-Also, we have brought the redshift as a parameter here.
-But if the center of your image always points to your main object, you can
also include the Query command to automatically find the object's redshift from
NED.
-Alternatively, your image may already be cropped; in this case, you can remove
the cropping step and use your image directly.
-@end cartouche
+@cindex Data format conversion
+@cindex Converting data formats
+@cindex Image format conversion
+@cindex Converting image formats
+@pindex @r{ConvertType (}astconvertt@r{)}
+The FITS format used in astronomy was defined mainly for archiving,
transmission, and processing.
+In other situations, the data might be useful in other formats.
+For example, when you are writing a paper or report, or if you are making
slides for a talk, you cannot use a FITS image.
+Other image formats should be used.
+In other cases you might want your pixel values in a table format as plain
text for input to other programs that do not recognize FITS.
+ConvertType was created for such situations.
+The recognized file types will increase with future updates, based on need.
-@verbatim
-# Parameters.
-sblow=22 # Minimum surface brightness.
-sbhigh=30 # Maximum surface brightness.
-bdir=build # Build directory location on filesystem.
-numticks=7 # Number of major ticks in each axis.
-redshift=0.619 # Redshift of object of interest.
-zeropoint=25.94 # Zero point of input image.
-scalelineinkpc=20 # Length of scale-line (in kilo parsecs).
-input=ah_f160w.fits # Name of input (to crop).
+The conversion is not only one-way (from FITS to other formats), but two-way
(except for the EPS and PDF formats@footnote{Because EPS and PDF are vector,
not raster/pixelated formats}).
+So you can also convert a JPEG image or text file into a FITS image.
+Basically, other than EPS/PDF, you can use any of the recognized formats as
different color channel inputs to get any of the recognized outputs.
-# Stop the script in case of a crash.
-set -e
+Before explaining the options and arguments (in @ref{Invoking astconvertt}),
we will start with a short discussion on the difference between raster and
vector graphics in @ref{Raster and Vector graphics}.
+In ConvertType, vector graphics are used to add markers over your originally
rasterized data, producing high quality images, ready to be used in your
exciting papers.
+We will continue with a description of the recognized file types in
@ref{Recognized file formats}, followed by a short introduction to digital
color in @ref{Color}.
+A tutorial on how to add markers over an image is then given in @ref{Marking
objects for publication} and we conclude with a @LaTeX{} based solution to add
coordinates over an image.
-# Build directory
-if ! [ -d $bdir ]; then mkdir $bdir; fi
+@menu
+* Raster and Vector graphics:: Images coming from nature, and the abstract.
+* Recognized file formats:: Recognized file formats
+* Color:: Some explanations on color.
+* Annotations for figure in paper:: Adding coordinates or physical scale.
+* Invoking astconvertt:: Options and arguments to ConvertType.
+@end menu
-# Crop out the desired region.
-crop=$bdir/crop.fits
-astcrop $input --center=53.1616278,-27.7802446 --mode=wcs \
- --width=20/3600 --output=$crop
+@node Raster and Vector graphics, Recognized file formats, ConvertType,
ConvertType
+@subsection Raster and Vector graphics
-# Warp the image to larger pixels to show surface brightness better.
-scaled=$bdir/scaled.fits
-astwarp $crop --centeroncorner --scale=1/3 --output=$scaled
+@cindex Raster graphics
+@cindex Graphics (raster)
+Images that are produced by hardware (for example, the camera in your phone,
or the camera connected to a telescope) provide pixelated data.
+Such data are therefore stored in a
@url{https://en.wikipedia.org/wiki/Raster_graphics, Raster graphics} format
which has discrete, independent, equally spaced data elements.
+For example, this is the format used by FITS (see @ref{Fits}), JPEG, TIFF,
PNG and other image formats.
-# Calculate the pixel area and convert image to Surface brightness.
-sb=$bdir/sb.fits
-pixarea=$(astfits $scaled --pixelareaarcsec2)
-astarithmetic $scaled $zeropoint $pixarea counts-to-sb \
- --output=$sb
+@cindex Vector graphics
+@cindex Graphics (vector)
+On the other hand, when something is generated by the computer (for example, a
diagram, plot or even adding a cross over a camera image to highlight something
there), there is no ``observation'' or connection with nature!
+Everything is abstract!
+For such things, it is much easier to draw a mathematical line (with infinite
resolution).
+Therefore, no matter how much you zoom-in, it will never get pixelated.
+This is the realm of @url{https://en.wikipedia.org/wiki/Vector_graphics,
Vector graphics}.
+If you open the Gnuastro manual in
@url{https://www.gnu.org/software/gnuastro/manual/gnuastro.pdf, PDF format},
you can see such graphics, for example, in @ref{Circles and the complex plane}
or @ref{Distance on a 2D curved space}.
+The most common vector graphics format is PDF for document sharing or SVG for
web-based applications.
-# Convert the surface brightness image into PDF.
-sbpdf=$bdir/sb.pdf
-astconvertt $sb --colormap=gray --borderwidth=0 \
- --fluxhigh=$sbhigh --fluxlow=$sblow --output=$sbpdf
+The pixels of a raster image can be shown as vector-based squares with
different shades, so vector graphics can generally also support raster graphics.
+This is very useful when you want to add some graphics over an image to help
your discussion (for example a @mymath{+} over your object of interest).
+However, vector graphics are not optimized for rasterized data (which are
usually also noisy!), and can either not display nicely, or result in a much
larger file size (in bytes).
+Therefore, if it is not necessary to add any marks over a FITS image, for
example, it may be better to store it in a rasterized format.
-# Specify the coordinates of the scale line (specifying a certain
-# width in kpc). We will put it on the top-right side of the image (5%
-# of the full width of the image away from the edge).
-coverage=$(astfits $sb --skycoverage --quiet | awk 'NR==2')
-scalelinedec=$(echo $coverage | awk '{print $4-($4-$3)*0.05}')
-scalelinerastart=$(echo $coverage | awk '{print $1+($2-$1)*0.05}')
-scalelineraend=$(astcosmiccal --redshift=$redshift --arcsectandist \
- | awk '{start='$scalelinerastart'; \
- width='$scalelineinkpc'/$1/3600; \
- print start+width}')
+The distinction between the vector and raster graphics is also the primary
theme behind Gnuastro's logo, see @ref{Logo of Gnuastro}.
-# Write the LaTeX macros to use in plot. Start with the thick line
-# showing tangential distance.
-macros=$bdir/macros.tex
-printf '\\newcommand{\\maScaleDec}'"{$scalelinedec}\n" > $macros
-printf '\\newcommand{\\maScaleRAa}'"{$scalelinerastart}\n" >> $macros
-printf '\\newcommand{\\maScaleRAb}'"{$scalelineraend}\n" >> $macros
-printf '\\newcommand{\\maScaleKpc}'"{$scalelineinkpc}\n" >> $macros
-printf '\\newcommand{\\maCenterZ}'"{$redshift}\n" >> $macros
-# Add image extrema for the coordinates.
-v=$(echo $coverage | awk '{print $1}')
-printf '\\newcommand{\\maCropRAMin}'"{$v}\n" >> $macros
-v=$(echo $coverage | awk '{print $2}')
-printf '\\newcommand{\\maCropRAMax}'"{$v}\n" >> $macros
-v=$(echo $coverage | awk '{print $3}')
-printf '\\newcommand{\\maCropDecMin}'"{$v}\n" >> $macros
-v=$(echo $coverage | awk '{print $4}')
-printf '\\newcommand{\\maCropDecMax}'"{$v}\n" >> $macros
+@node Recognized file formats, Color, Raster and Vector graphics, ConvertType
+@subsection Recognized file formats
-# Distance between each tick value.
-v=$(echo $coverage | awk '{print ($2-$1)/'$numticks'}')
-printf '\\newcommand{\\maTickDist}'"{$v}\n" >> $macros
-printf '\\newcommand{\\maSBlow}'"{$sblow}\n" >> $macros
-printf '\\newcommand{\\maSBhigh}'"{$sbhigh}\n" >> $macros
+The various standards and the file name extensions recognized by ConvertType
are listed below.
+For a review on the difference between Raster and Vector graphics, see
@ref{Raster and Vector graphics}.
+For a review on the concept of color and channels, see @ref{Color}.
+Currently, except for the FITS format, Gnuastro uses the file name's suffix to
identify the format, so if the file's name does not end with one of the
suffixes mentioned below, it will not be recognized.
-# Copy the LaTeX source into the build directory and go there to run
-# it and have all the temporary LaTeX files there.
-cp report.tex my-figure.tex $bdir
-cd $bdir
-rm -f *-figure0*
-pdflatex -shell-escape -halt-on-error report.tex
-@end verbatim
+@table @asis
+@item FITS or IMH
+@cindex IRAF
+@cindex Astronomical data format
+Astronomical data are commonly stored in the FITS format (or the older IRAF
@file{.imh} format); a list of file name suffixes which indicate that the file
is in this format is given in @ref{Arguments}.
+FITS is a raster graphics format.
+Each image extension of a FITS file only has one value per pixel/element.
+Therefore, when used as input, each input FITS image contributes as one color
channel.
+If you want multiple extensions in one FITS file for different color channels,
you have to repeat the file name multiple times and use the @option{--hdu},
@option{--hdu2}, @option{--hdu3} or @option{--hdu4} options to specify the
different extensions.
+@item JPEG
+@cindex JPEG format
+@cindex Raster graphics
+@cindex Pixelated graphics
+The JPEG standard was created by the Joint Photographic Experts Group.
+It is currently one of the most commonly used image formats.
+Its major advantage is the compression algorithm that is defined by the
standard.
+Like the FITS standard, this is a raster graphics format, which means that it
is pixelated.
-@node Invoking astconvertt, , Annotations for figure in paper, ConvertType
-@subsection Invoking ConvertType
+A JPEG file can have 1 (for gray-scale), 3 (for RGB) and 4 (for CMYK) color
channels.
+If you only want to convert one JPEG image into other formats, there is no
problem, however, if you want to use it in combination with other input files,
make sure that the final number of color channels does not exceed four.
+If it does, then ConvertType will abort and notify you.
-ConvertType will convert any recognized input file type to any specified
output type.
-The executable name is @file{astconvertt} with the following general template
+@cindex Suffixes, JPEG images
+The file name endings that are recognized as a JPEG file for input are:
+@file{.jpg}, @file{.JPG}, @file{.jpeg}, @file{.JPEG}, @file{.jpe},
@file{.jif}, @file{.jfif} and @file{.jfi}.
-@example
-$ astconvertt [OPTION...] InputFile [InputFile2] ... [InputFile4]
-@end example
+@item TIFF
+@cindex TIFF format
+TIFF (or Tagged Image File Format) was originally designed as a common format
for scanners in the early 90s and since then it has grown to become very
general.
+In many aspects, the TIFF standard is similar to the FITS image standard: it
can allow data of many types (see @ref{Numeric data types}), and also allows
multiple images to be stored in a single file (like a FITS extension: each
image in the file is called a `directory' in the TIFF standard).
+However, unlike FITS, it can only store images, it has no constructs for
tables.
+Also unlike FITS, each `directory' of a TIFF file can have a multi-channel
(e.g., RGB) image.
+Another (inconvenient) difference with the FITS standard is that keyword names
are stored as numbers, not human-readable text.
-@noindent
-One line examples:
+However, outside of astronomy, because of its support of different numeric
data types, many fields use TIFF images for accurate (for example, 16-bit
integer or floating point) imaging data.
-@example
-## Convert an image in FITS to PDF:
-$ astconvertt image.fits --output=pdf
+@item EPS
+@cindex EPS
+@cindex PostScript
+@cindex Vector graphics
+@cindex Encapsulated PostScript
+The Encapsulated PostScript (EPS) format is essentially a one page PostScript
file which has a specified size.
+PostScript is used to store full documents, like this whole Gnuastro book.
+PostScript therefore also includes non-image data, for example, lines and
text.
+It is a fully functional programming language to describe a document.
+A PostScript file is a plain text file that can be edited like any program
source with any plain-text editor.
+Therefore in ConvertType, EPS is only an output format and cannot be used as
input.
+Contrary to the FITS or JPEG formats, PostScript is not a raster format, but
is categorized as vector graphics.
-## Similar to before, but use the Viridis color map:
-$ astconvertt image.fits --colormap=viridis --output=pdf
+@cindex @TeX{}
+@cindex @LaTeX{}
+With these features in mind, you can see that when you are compiling a
document with @TeX{} or @LaTeX{}, using an EPS file is much more low-level than
a JPEG, and thus you have much greater control and therefore quality.
+Since it also includes vector graphic lines, we also use such lines to make a
thin border around the image, greatly improving its appearance in the document.
+Furthermore, through EPS, you can add marks over the image in many shapes and
colors.
+No matter the resolution of the display or printer, these lines will always be
clear and not pixelated.
+However, this can be done better with tools within @TeX{} or @LaTeX{} such as
PGF/Tikz@footnote{@url{http://sourceforge.net/projects/pgf/}}.
-## Add markers to highlight parts of the image
-## ('marks.fits' is a table containing coordinates)
-$ astconvertt image.fits --marks=marks.fits --output=pdf
+@cindex Binary image
+@cindex Saving binary image
+@cindex Black and white image
+If the final input image (possibly after all operations on the flux explained
below) is a binary image or only has two colors of black and white (in
segmentation maps for example), then PostScript has another great advantage
compared to other formats.
+It allows for 1-bit pixels (pixels with a value of 0 or 1), which can decrease
the output file size by a factor of 8.
+So if a gray-scale image is binary, ConvertType will exploit this property in
the EPS and PDF (see below) outputs.
-## Convert an image in JPEG to FITS (with multiple extensions
-## if it has color):
-$ astconvertt image.jpg -oimage.fits
+@cindex Suffixes, EPS format
+The standard formats for an EPS file are @file{.eps}, @file{.EPS},
@file{.epsf} and @file{.epsi}.
+The EPS outputs of ConvertType have the @file{.eps} suffix.
-## Use three 2D arrays to create an RGB JPEG output (two are
-## plain-text, the third is FITS, but all have the same size).
-$ astconvertt f1.txt f2.txt f3.fits -o.jpg
+@item PDF
+@cindex PDF
+@cindex Adobe systems
+@cindex PostScript vs. PDF
+@cindex Compiled PostScript
+@cindex Portable Document format
+@cindex Static document description format
+The Portable Document Format (PDF) is currently the most common format for
documents.
+It is a vector graphics format, allowing abstract constructs like marks or
borders.
-## Use two images and one blank for an RGB EPS output:
-$ astconvertt M31_r.fits M31_g.fits blank -oeps
+The PDF format is based on PostScript, so it shares all the features mentioned
above for EPS.
+To be able to display or print its programmed content, a PostScript file
needs to pass through a processor or compiler.
+A PDF file can be thought of as the processed output of the PostScript
compiler.
+PostScript, EPS and PDF were created and are registered by Adobe Systems.
-## Directly pass input from output of another program through Standard
-## input (not a file).
-$ cat 2darray.txt | astconvertt -oimg.fits
-@end example
+@cindex Suffixes, PDF format
+@cindex GPL Ghostscript
+As explained under EPS above, a PDF document is a static document description
format; viewing its result is therefore much faster and more efficient than
PostScript.
+To create a PDF output, ConvertType will make an EPS file and convert that to
PDF using GPL Ghostscript.
+The suffixes recognized for a PDF file are: @file{.pdf}, @file{.PDF}.
+If GPL Ghostscript cannot be run on the PostScript file, the EPS file will
remain and a warning will be printed (see @ref{Optional dependencies}).
-In the sub-sections below various options that are specific to ConvertType are
grouped in different categories.
-Please see those sections for a detailed discussion on each group and its
options.
-Besides those, ConvertType also shares the @ref{Common options} with other
Gnuastro programs.
-The common options are not repeated here.
+@item @option{blank}
+@cindex @file{blank} color channel
+This is not actually a file type, but it can be used to fill one color channel
with a blank value.
+If this argument is given for any color channel, that channel will not be used
in the output.
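+For example, the sketch below fills the third (blue) channel of an RGB EPS
output with blank values, so only the first two channels are used:
+@example
+$ astconvertt M31_r.fits M31_g.fits blank -oeps
+@end example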
+@item Plain text
+@cindex Plain text
+@cindex Suffixes, plain text
+The value of each pixel in a 2D image can be written as a 2D matrix in a
plain-text file.
+Therefore, for the purpose of ConvertType, plain-text files are a
single-channel raster graphics file format.
-@menu
-* ConvertType input and output:: Input/output file names and formats.
-* Pixel visualization:: Visualizing the pixels in the output.
-* Drawing with vector graphics:: Adding marks in many shapes and colors over
the pixels.
-@end menu
+Plain text files have the advantage that they can be viewed with any text
editor or on the command-line.
+Most programs also support input as plain text files.
+As input, each plain text file is considered to contain one color channel.
-@node ConvertType input and output, Pixel visualization, Invoking astconvertt,
Invoking astconvertt
-@subsubsection ConvertType input and output
+In ConvertType, the recognized extensions for plain text files are @file{.txt}
and @file{.dat}.
+As described in @ref{Invoking astconvertt}, if you just give these extensions
(and not a full filename) as output, then automatic output will be performed
to determine the final output name (see @ref{Automatic output}).
+Besides these, when the format of a file cannot be recognized from its name,
ConvertType will fall back to plain text mode.
+So you can use any name (even without an extension) for a plain text input or
output.
+Just note that when the suffix is not recognized, automatic output will not be
performed.
-@cindex Standard input
-At most four input files (one for each color channel for formats that allow
it) are allowed in ConvertType.
-The first input dataset can either be a file, or come from Standard input (see
@ref{Standard input} and @ref{Recognized file formats}).
+The basic input/output on plain text images is very similar to how tables are
read/written as described in @ref{Gnuastro text table format}.
+Simply put, the restrictions are very loose, and there is a convention to
define a name, units, data type (see @ref{Numeric data types}), and comments
for the data in a commented line.
+The only difference is that as a table, a text file can contain many datasets
(columns), but as a 2D image, it can only contain one dataset.
+As a result, only one information comment line is necessary for a 2D image,
and instead of the starting `@code{# Column N}' (@code{N} is the column
number), the information line for a 2D image must start with `@code{# Image 1}'.
+When ConvertType is asked to output to plain text file, this information
comment line is written before the image pixel values.
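+For example, a tiny 3 by 2 image may look like the sketch below (the name,
units and type in the information comment line are hypothetical; for the
metadata syntax, see @ref{Gnuastro text table format}):
+@example
+# Image 1: MYIMAGE [counts,f32] A hypothetical 2D image.
+1.0   2.0   3.0
+4.0   5.0   6.0
+@end example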
-The order of multiple input files is important.
-After reading the input file(s) the number of color channels in all the inputs
will be used to define which color space to use for the outputs and how each
color channel is interpreted: 1 (for grayscale), 3 (for RGB) and 4 (for CMYK)
input channels.
-For more on pixel color channels, see @ref{Pixel colors}.
-Depending on the format of the input(s), the number of input files can differ.
+When converting an image to plain text, consider the fact that if the image is
large, the number of columns in each line will become very large, possibly
making it very hard to open in some text editors.
-For example, if you plan to build an RGB PDF and your three channels are in
the first HDU of @file{r.fits}, @file{g.fits} and @file{b.fits}, then you can
simply call ConvertType like this:
+@item Standard output (command-line)
+This is very similar to the plain text output, but instead of creating a file
to keep the printed values, they are printed on the command-line.
+This can be very useful when you want to redirect the results directly to
another program in one command with no intermediate file.
+The only difference is that only the pixel values are printed (with no
information comment line).
+To print to the standard output, set the output name to `@file{stdout}'.
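+For example, the sketch below prints the pixel values of the first extension
of @file{image.fits} and directly feeds them to AWK, with no intermediate
file:
+@example
+$ astconvertt image.fits -h1 --output=stdout \
+             | awk '@{print $1@}'
+@end example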
-@example
-$ astconvertt r.fits g.fits b.fits -g1 --output=rgb.pdf
-@end example
+@end table
-@noindent
-However, if the three color channels are in three extensions (assuming the
HDUs are respectively named @code{R}, @code{G} and @code{B}) of a single file
(assuming @file{channels.fits}), you should run it like this:
+@node Color, Annotations for figure in paper, Recognized file formats,
ConvertType
+@subsection Color
-@example
-$ astconvertt channels.fits -hR -hG -hB --output=rgb.pdf
-@end example
+@cindex RGB
+@cindex Filter
+@cindex Color channel
+@cindex Channel (color)
+Color is generally defined after mixing various data ``channels''.
+The values for each channel usually come from a filter that is placed in the
optical path.
+Filters only allow a certain window of the spectrum to pass (for example, the
SDSS @emph{r} filter only allows light from about 5500 to 7000 Angstroms).
+In digital monitors or common digital cameras, a different set of filters is
used: Red, Green and Blue (commonly known as RGB), which are more optimized
to the eye's perception.
+On the other hand, when printing on paper, standard printers use the cyan,
magenta, yellow and key (CMYK, key=black) color space.
-@noindent
-On the other hand, if the channels are already in a multi-channel format (like
JPEG), you can simply provide that file:
+@menu
+* Pixel colors:: Multiple filters in each pixel.
+* Colormaps for single-channel pixels:: Better display of single-filter
images.
+* Vector graphics colors::
+@end menu
-@example
-$ astconvertt image.jpg --output=rgb.pdf
-@end example
-@noindent
-If multiple channels are given as input, and the output format does not
support multiple color channels (for example, FITS), ConvertType will put the
channels in different HDUs, like the example below.
-After running the @command{astfits} command, if your JPEG file was not
grayscale (single channel), you will see multiple HDUs in @file{channels.fits}.
+@node Pixel colors, Colormaps for single-channel pixels, Color, Color
+@subsubsection Pixel colors
+@cindex RGB
+@cindex CMYK
+@cindex Image
+@cindex Color
+@cindex Pixels
+@cindex Colormap
+@cindex Primary colors
+
+@cindex Color channel
+@cindex Channel, color
+As discussed in @ref{Color}, for each displayed/printed pixel of a color
image, the dataset/image has three or four values.
+To store/show the three values for each pixel, cameras and monitors allocate a
certain fraction of each pixel's area to red, green and blue filters.
+These three filters are thus built into the hardware at the pixel level.
+
+However, because measurement accuracy is very important in scientific
instruments, and we want to do measurements (take images) with various/custom
filters (without having to order a new expensive detector!), scientific
detectors use the full area of the pixel to store one value for it in a
single/mono channel dataset.
+To make measurements in different filters, we just place a filter in the light
path before the detector.
+Therefore, the FITS format that is used to store astronomical datasets is
inherently a mono-channel format (see @ref{Recognized file formats} or
@ref{Fits}).
+
+@cindex False color
+@cindex Pseudo color
+When a subject has been imaged in multiple filters, you can feed each
different filter into the red, green and blue channels of your monitor and
obtain a false-colored visualization.
+The reason we say ``false-color'' (or pseudo color) is that generally, the
three data channels you provide are not from the same Red, Green and Blue
filters of your monitor!
+So the observed color on your monitor does not correspond to the physical
``color'' that you would have seen if you looked at the object by eye.
+Nevertheless, it is good (and sometimes necessary) for visualization (of
special features).
+
+In ConvertType, you can do this by giving each separate single-channel dataset
(for example, in the FITS image format) as an argument (in the proper order),
then asking for the output in a format that supports multi-channel datasets
(for example, see the command below, or @ref{ConvertType input and output}).
@example
-$ astconvertt image.jpg --output=channels.fits
-$ astfits channels.fits
+$ astconvertt r.fits g.fits b.fits --output=color.jpg
@end example
-As shown above, the output's file format will be interpreted from the name
given to the @option{--output} option (as a common option to all Gnuastro
programs, for the description of @option{--output}, see @ref{Input output
options}).
-It can either be given on the command-line or in any of the configuration
files (see @ref{Configuration files}).
-When the output suffix is not recognized, it will default to plain text
format, see @ref{Recognized file formats}.
-If there is one input dataset (color channel) the output will be gray-scale.
-When three input datasets (color channels) are given, they are respectively
considered to be the red, green and blue color channels.
-Finally, if there are four color channels they will be cyan, magenta, yellow
and black (CMYK colors).
+@node Colormaps for single-channel pixels, Vector graphics colors, Pixel
colors, Color
+@subsubsection Colormaps for single-channel pixels
-The value to @option{--output} (or @option{-o}) can be either a full file name
or just the suffix of the desired output format.
-In the former case (full name), it will be directly used for the output's file
name.
-In the latter case, the name of the output file will be set based on the
automatic output guidelines, see @ref{Automatic output}.
-Note that the suffix name can optionally start with a @file{.} (dot), so for
example, @option{--output=.jpg} and @option{--output=jpg} are equivalent.
-See @ref{Recognized file formats}.
+@cindex Visualization
+@cindex Colormap, HSV
+@cindex HSV: Hue Saturation Value
+As discussed in @ref{Pixel colors}, color is not defined when a dataset/image
contains a single value for each pixel.
+However, we interact with scientific datasets through monitors or printers.
+They allow multiple channels (independent values) per pixel and produce color
with them (on monitors, this is usually with three channels: Red, Green and
Blue).
+As a result, there is a lot of freedom in visualizing a single-channel dataset.
-The relevant options for input/output formats are described below:
+The mapping of single-channel values to multi-channel colors is called a
``color map''.
+Since more information can be put in multiple channels, this usually results
in a better visualization of the dynamic range of your single-channel data.
+In ConvertType, you can use the @option{--colormap} option to choose between
different mappings of mono-channel inputs, see @ref{Invoking astconvertt}.
+Below, we will review two of the basic color maps, please see the description
of @option{--colormap} in @ref{Invoking astconvertt} for the full list.
-@table @option
-@item -h STR/INT
-@itemx --hdu=STR/INT
-Input HDU name or counter (counting from 0) for each input FITS file.
-If the same HDU should be used from all the FITS files, you can use the
@option{--globalhdu} option described below.
-In ConvertType, it is possible to call the HDU option multiple times for the
different input FITS or TIFF files in the same order that they are called on
the command-line.
-Note that in the TIFF standard, one `directory' (similar to a FITS HDU) may
contain multiple color channels (for example, when the image is in RGB).
+@itemize
+@item
+@cindex Grayscale
+@cindex Colormap, gray-scale
+The most basic colormap is shades of black (because of its strong contrast
with white).
+This scheme is called @url{https://en.wikipedia.org/wiki/Grayscale, Grayscale}.
+But ultimately, the black is just one color, so with Grayscale, you are not
using the full dynamic range of the three-channel monitor effectively.
+To help in visualization, more complex mappings can be defined.
-Except for the fact that multiple calls are possible, this option is identical
to the common @option{--hdu} in @ref{Input output options}.
-The number of calls to this option cannot be less than the number of input
FITS or TIFF files, but if there are more, the extra HDUs will be ignored, note
that they will be read in the order described in @ref{Configuration file
precedence}.
+@item
+A slightly more complex color map can be defined when you scale the values to
a range of 0 to 360, and use it as the ``Hue'' term of the
@url{https://en.wikipedia.org/wiki/HSL_and_HSV, Hue-Saturation-Value} (HSV)
color space (while fixing the ``Saturation'' and ``Value'' terms).
+The increased usage of the monitor's 3-channel color space is indeed better,
but the resulting images can look ``un-natural'' to the eye.
+@end itemize
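+For example, with the commands below you can compare the two color maps above
on a single-channel image (a minimal sketch; @file{image.fits} is a
hypothetical input file name):
+@example
+## Shades of black (Grayscale):
+$ astconvertt image.fits --colormap=gray --output=gray.pdf
+## Values mapped to Hue (HSV):
+$ astconvertt image.fits --colormap=hsv --output=hsv.pdf
+@end example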
-Unlike CFITSIO, libtiff (which is used to read TIFF files) only recognizes
numbers (counting from zero, similar to CFITSIO) for `directory' identification.
-Hence the concept of names is not defined for the directories and the values
to this option for TIFF files must be numbers.
+Since grayscale is a commonly used mapping of single-valued datasets, we will
continue with a closer look at how it is stored.
+One way to represent a gray-scale image in different color spaces is to use
the same proportions of the primary colors in each pixel.
+This is the common way most FITS image viewers work: for each pixel, they fill
all the channels with the single value.
+While this is necessary for displaying a dataset, there are downsides when
storing/saving this type of grayscale visualization (for example, in a paper).
-@item -g STR/INT
-@itemx --globalhdu=STR/INT
-Use the value given to this option (a HDU name or a counter, starting from 0)
for the HDU identifier of all the input FITS files.
-This is useful when all the inputs are distributed in different files, but
have the same HDU in those files.
+@itemize
-@item -w FLT
-@itemx --widthincm=FLT
-The width of the output in centimeters.
-This is only relevant for those formats that accept such a width as metadata
(not FITS or plain-text for example), see @ref{Recognized file formats}.
-For most digital purposes, the number of pixels is far more important than the
value to this parameter because you can adjust the absolute width (in inches or
centimeters) in your document preparation program.
+@item
+Three (for RGB) or four (for CMYK) values have to be stored for every pixel;
this makes the output file very heavy (in terms of bytes).
-@item -x
-@itemx --hex
-@cindex ASCII85 encoding
-@cindex Hexadecimal encoding
-Use Hexadecimal encoding in creating EPS output.
-By default the ASCII85 encoding is used which provides a much better
compression ratio.
-When converted to PDF (or included in @TeX{} or @LaTeX{} which is finally
saved as a PDF file), an efficient binary encoding is used which is far more
efficient than both of them.
-The choice of EPS encoding will thus have no effect on the final PDF.
+@item
+When printing, small alignment errors between the color channels can make the
printed image slightly more blurred than it actually is.
-So if you want to transfer your EPS files (for example, if you want to submit
your paper to arXiv or journals in PostScript), their storage might become
important if you have large images or lots of small ones.
-By default ASCII85 encoding is used which offers a much better compression
ratio (nearly 40 percent) compared to Hexadecimal encoding.
+@end itemize
-@item -u INT
-@itemx --quality=INT
-@cindex JPEG compression quality
-@cindex Compression quality in JPEG
-@cindex Quality of compression in JPEG
-The quality (compression) of the output JPEG file with values from 0 to 100
(inclusive).
-For other formats the value to this option is ignored.
-Note that only in gray-scale (when one input color channel is given) will this
actually be the exact quality (each pixel will correspond to one input value).
-If it is in color mode, some degradation will occur.
-While the JPEG standard does support loss-less graphics, it is not commonly
supported.
-@end table
+@cindex PNG standard
+@cindex Single channel CMYK
+To solve both these problems when storing grayscale visualization, the best
way is to save a single-channel dataset into the black channel of the CMYK
color space.
+The JPEG standard is the only common standard that accepts CMYK color space.
-@node Pixel visualization, Drawing with vector graphics, ConvertType input and
output, Invoking astconvertt
-@subsubsection Pixel visualization
+The JPEG and EPS standards set two sizes for the number of bits in each
channel: 8-bit and 12-bit.
+The former is by far the most common and is what is used in ConvertType.
+Therefore, each channel should have values between 0 and @mymath{2^8-1=255}.
+From this we see that each pixel in a gray-scale image is one byte (8 bits)
long; in an RGB image it is 3 bytes long, and in CMYK it is 4 bytes long.
+But thanks to the JPEG compression algorithms, when all the pixels of one
channel have the same value, that channel is compressed to one pixel.
+Therefore a Grayscale image and a CMYK image that has only the K-channel
filled are approximately the same file size.
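+As a quick demonstration of the sizes above (a minimal sketch;
@file{image.fits} is a hypothetical single-channel input), you can compare
the byte-sizes of the input and output yourself:
+@example
+$ astconvertt image.fits --colormap=gray --output=gray.jpg
+$ ls -lh image.fits gray.jpg
+@end example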
-The main goal of ConvertType is to visualize pixels to/from print or web
friendly formats.
+@node Vector graphics colors, , Colormaps for single-channel pixels, Color
+@subsubsection Vector graphics colors
+@cindex Web colors
+@cindex Colors (web)
+When creating vector graphics, ConvertType recognizes the
@url{https://en.wikipedia.org/wiki/Web_colors#Extended_colors, extended web
colors} that are the result of merging the colors in the HTML 4.01, CSS 2.0,
SVG 1.0 and CSS3 standards.
+They are all shown with their standard name in @ref{colornames}.
+The names are not case sensitive so you can use them in any form (for example,
@code{turquoise} is the same as @code{Turquoise} or @code{TURQUOISE}).
-Astronomical data usually have a very large dynamic range (difference between
maximum and minimum value) and different subjects might be better demonstrated
with a limited flux range.
+@cindex 24-bit terminal
+@cindex True color terminal
+@cindex Terminal (true color, 24 bit)
+On the command-line, you can also get the list of colors with the
@option{--listcolors} option to ConvertType, like below.
+In particular, if your terminal is 24-bit or ``true color'', the last column
will show each color itself.
+This greatly helps in easily selecting the best color for your purpose on the
command-line (without taking your hands off the keyboard and getting
distracted).
-@table @option
-@item --colormap=STR[,FLT,...]
-The color map to visualize a single channel.
-The first value given to this option is the name of the color map, which is
shown below.
-Some color maps can be configured.
-In this case, the configuration parameters are optionally given as numbers
following the name of the color map for example, see @option{hsv}.
-The table below contains the usable names of the color maps that are currently
supported:
+@example
+$ astconvertt --listcolors
+@end example
-@table @option
-@item gray
-@itemx grey
-@cindex Colorspace, gray-scale
-Grayscale color map.
-This color map does not have any parameters.
-The full dataset range will be scaled to 0 and @mymath{2^8-1=255} to be stored
in the requested format.
+@float Figure,colornames
+@center@image{gnuastro-figures/color-names, 15cm, , }
-@item hsv
-@cindex Colorspace, HSV
-@cindex Hue, saturation, value
-@cindex HSV: Hue Saturation Value
-Hue, Saturation,
Value@footnote{@url{https://en.wikipedia.org/wiki/HSL_and_HSV}} color map.
-If no values are given after the name (@option{--colormap=hsv}), the dataset
will be scaled to 0 and 360 for hue covering the full spectrum of colors.
-However, you can limit the range of hue (to show only a special color range)
by explicitly requesting them after the name (for example,
@option{--colormap=hsv,20,240}).
+@caption{Recognized color names in Gnuastro, shown with their numerical
identifiers.}
+@end float
-The mapping of a single-channel dataset to HSV is done through the Hue and
Value elements: Lower dataset elements have lower ``value'' @emph{and} lower
``hue''.
-This creates darker colors for fainter parts, while also respecting the range
of colors.
+@node Annotations for figure in paper, Invoking astconvertt, Color, ConvertType
+@subsection Annotations for figure in paper
-@item viridis
-@cindex matplotlib
-@cindex Colormap: Viridis
-@cindex Viridis: Colormap
-Viridis is the default colormap of the popular Matplotlib module of Python and
available in many other visualization tools like PGFPlots.
+@cindex Image annotation
+@cindex Annotation of images for paper
+To make a nice figure from your FITS images, it is important to show more than
merely the raw image (converted to a printer friendly format like PDF or JPEG).
+Annotations (or visual metadata) over the raw image greatly help the readers
clearly see your argument and put the image/result in a larger context.
+Examples include:
+@itemize
+@item
+Coordinates (Right Ascension and Declination) on the edges of the image, so
viewers of your paper or presentation slides can get a physical feeling of the
field's sky coverage.
+@item
+A thick line that has a fixed tangential size (for example, in kilo parsecs)
at the redshift/distance of interest.
+@item
+Contours (for example, of radio or X-ray emission) drawn over an optical
image.
+@item
+Text, arrows, etc., over certain parts of the image.
+@end itemize
-@item sls
-@cindex DS9
-@cindex SAO DS9
-@cindex SLS Color
-@cindex Colormap: SLS
-The SLS color range, taken from the commonly used @url{http://ds9.si.edu,SAO
DS9}.
-The advantage of this color range is that it starts with black, going into
dark blue and finishes with the brighter colors of red and white.
-So unlike the HSV color range, it includes black and white and brighter colors
(like yellow, red) show the larger values.
+@cindex PGFPlots
+Because of the modular philosophy of Gnuastro, ConvertType is only focused on
converting your FITS images to printer friendly formats like JPEG or PDF.
+But to present your results in a slide or paper, you will often need to
annotate the raw JPEG or PDF with some of the features above.
+The good news is that there are many powerful plotting programs that you can
use to add such annotations.
+As a result, there is no point in making a new one, specific to Gnuastro.
+In this section, we will demonstrate this using the very powerful
PGFPlots@footnote{@url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}
package of @LaTeX{}.
-@item sls-inverse
-@cindex Colormap: SLS-inverse
-The inverse of the SLS color map (see above), where the lowest value
corresponds to white and the highest value is black.
-While SLS is good for visualizing on the monitor, SLS-inverse is good for
printing.
-@end table
+@cartouche
+@noindent
+@strong{Single script for easy running:} In this section we are reviewing the
reason and details of every step, which is good for educational purposes.
+But when you already know the steps, these separate code blocks can be
annoying.
+Therefore the full script (except for the data download step) is available in
@ref{Full script of annotations on figure}.
+@end cartouche
-@item --rgbtohsv
-When there are three input channels and the output is in the FITS format,
interpret the three input channels as red, green and blue channels (RGB) and
convert them to the hue, saturation, value (HSV) color space.
+@cindex TiKZ
+@cindex Matplotlib
+PGFPlots uses the same @LaTeX{} graphic engine that typesets your paper/slide.
+Therefore when you build your plots and figures using PGFPlots (and its
underlying package
PGF/TiKZ@footnote{@url{http://mirrors.ctan.org/graphics/pgf/base/doc/pgfmanual.pdf}})
your plots will blend beautifully within your text: same fonts, same colors,
same line properties, etc.
+Since most papers (and presentation slides@footnote{To build slides, @LaTeX{}
has packages like Beamer, see
@url{http://mirrors.ctan.org/macros/latex/contrib/beamer/doc/beameruserguide.pdf}})
are made with @LaTeX{}, PGFPlots is the best tool for those who use
@LaTeX{} to create documents.
+PGFPlots also does not need any extra dependencies beyond a basic/minimal
@TeX{}-live installation, so it is much more reliable than tools like
Matplotlib in Python that have hundreds of fast-evolving
dependencies@footnote{See Figure 1 of Alliez et al. 2020 at
@url{https://arxiv.org/pdf/1905.11123.pdf}}.
-The currently supported output formats of ConvertType do not have native
support for HSV.
-Therefore this option is only supported when the output is in FITS format and
each of the hue, saturation and value arrays can be saved as one FITS extension
in the output for further analysis (for example, to select a certain color).
+To demonstrate this, we will create a surface brightness image of a galaxy in
the F160W filter of the ABYSS
survey@footnote{@url{http://research.iac.es/proyecto/abyss}}.
+In the code-block below, let's make a ``build'' directory to keep intermediate
files and avoid populating the source.
+Afterwards, we will download the full image and crop out a 20 arcsec wide
image around the galaxy with the commands below.
+You can run these commands in an empty directory.
-@item -c STR
-@itemx --change=STR
-@cindex Change converted pixel values
-(@option{=STR}) Change pixel values with the following format
@option{"from1:to1, from2:to2,..."}.
-This option is very useful in displaying labeled pixels (not actual data
images which have noise) like segmentation maps.
-In labeled images, usually a group of pixels have a fixed integer value.
-With this option, you can manipulate the labels before the image is displayed
to get a better output for print or to emphasize on a particular set of labels
and ignore the rest.
-The labels in the images will be changed in the same order given.
-By default first the pixel values will be converted then the pixel values will
be truncated (see @option{--fluxlow} and @option{--fluxhigh}).
+@example
+$ mkdir build
+$ wget http://cdsarc.u-strasbg.fr/ftp/J/A+A/621/A133/fits/ah_f160w.fits
+$ astcrop ah_f160w.fits --center=53.1616278,-27.7802446 --mode=wcs \
+ --width=20/3600 --output=build/crop.fits
+@end example
-You can use any number for the values irrespective of your final output, your
given values are stored and used in the double precision floating point format.
-So for example, if your input image has labels from 1 to 20000 and you only
want to display those with labels 957 and 11342 then you can run ConvertType
with these options:
+To better show the low surface brightness (LSB) outskirts, we will warp the
image, then convert the pixel units to surface brightness with the commands
below.
+It is very important that the warping is done @emph{before} the conversion to
surface brightness (in units of mag/arcsec@mymath{^2}), because the definition
of surface brightness is non-linear.
+For more, see the surface brightness topic of @ref{Brightness flux magnitude},
and for a more complete tutorial, see @ref{FITS images in a publication}.
@example
-$ astconvertt --change=957:50000,11342:50001 --fluxlow=5e4 \
- --fluxhigh=1e5 segmentationmap.fits --output=jpg
+$ zeropoint=25.94
+$ astwarp build/crop.fits --centeroncorner --scale=1/3 \
+ --output=build/scaled.fits
+$ pixarea=$(astfits build/scaled.fits --pixelareaarcsec2)
+$ astarithmetic build/scaled.fits $zeropoint $pixarea counts-to-sb \
+ --output=build/sb.fits
@end example
-@noindent
-While the output JPEG format is only 8 bit, this operation is done in an
intermediate step which is stored in double precision floating point.
-The pixel values are converted to 8-bit after all operations on the input
fluxes have been complete.
-By placing the value in double quotes you can use as many spaces as you like
for better readability.
-
-@item -C
-@itemx --changeaftertrunc
-Change pixel values (with @option{--change}) after truncation of the flux
values, by default it is the opposite.
+We are now ready to convert the surface brightness image into a PDF.
+To better show the LSB features, we will also limit the color range with the
@option{--fluxlow} and @option{--fluxhigh} options: all pixels with a surface
brightness brighter than 22 mag/arcsec@mymath{^2} will be shown as black, and
all pixels with a surface brightness fainter than 30 mag/arcsec@mymath{^2} will
be white.
+These thresholds are defined as shell variables, because we will also need
them below (to pass into PGFPlots).
+We will also set @option{--borderwidth=0}, because the coordinate system we
will add over the image will effectively be a border for the image (separating
it from the background).
-@item -L FLT
-@itemx --fluxlow=FLT
-The minimum flux (pixel value) to display in the output image, any pixel value
below this value will be set to this value in the output.
-If the value to this option is the same as @option{--fluxhigh}, then no flux
truncation will be applied.
-Note that when multiple channels are given, this value is used for all the
color channels.
+@example
+$ sblow=22
+$ sbhigh=30
+$ astconvertt build/sb.fits --colormap=gray --borderwidth=0 \
+ --fluxhigh=$sbhigh --fluxlow=$sblow --output=build/sb.pdf
+@end example
-@item -H FLT
-@itemx --fluxhigh=FLT
-The maximum flux (pixel value) to display in the output image, see
-@option{--fluxlow}.
+Please open @file{sb.pdf} and have a look.
+Also, please open @file{sb.fits} in DS9 (or any other FITS viewer) and play
with the color range.
+Can the surface brightness limits be changed to better show the LSB structure?
+If so, you are free to change the limits above.
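+For example (hypothetical thresholds, just for comparison), you can rebuild
the PDF with a slightly different range and inspect the difference:
+@example
+$ astconvertt build/sb.fits --colormap=gray --borderwidth=0 \
+              --fluxhigh=31 --fluxlow=23 --output=build/sb-test.pdf
+@end example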
-@item -m INT
-@itemx --maxbyte=INT
-This is only used for the JPEG and EPS output formats which have an 8-bit
space for each channel of each pixel.
-The maximum value in each pixel can therefore be @mymath{2^8-1=255}.
-With this option you can change (decrease) the maximum value.
-By doing so you will decrease the dynamic range.
-It can be useful if you plan to use those values for other purposes.
+We now have the printable PDF representation of the image, but as discussed
above, it is not enough for a paper.
+We will add 1) a thick line showing the size of 20 kpc (kilo parsecs) at the
redshift of the central galaxy, 2) coordinates, and 3) a color bar showing the
surface brightness corresponding to each grayscale level.
-@item -A
-@itemx --forcemin
-Enforce the value of @option{--fluxlow} (when it is given), even if it is
smaller than the minimum of the dataset and the output is format supporting
color.
-This is particularly useful when you are converting a number of images to a
common image format like JPEG or PDF with a single command and want them all to
have the same range of colors, independent of the contents of the dataset.
-Note that if the minimum value is smaller than @option{--fluxlow}, then this
option is redundant.
+To get the first job done, we first need to know the redshift of the central
galaxy.
+To do this, we can use Gnuastro's Query program to look into all the objects
in NED within this image (only asking for the RA, Dec and redshift columns).
+We will then use the Match program to find the NED entry that corresponds to
our galaxy.
-@cindex PDF
-@cindex EPS
-@cindex PostScript
-By default, when the dataset only has two values, @emph{and} the output format
is PDF or EPS, ConvertType will use the PostScript optimization that allows
setting the pixel values per bit, not byte (@ref{Recognized file formats}).
-This can greatly help reduce the file size.
-However, when @option{--fluxlow} or @option{--fluxhigh} are called, this
optimization is disabled: even though there are only two values (is binary),
the difference between them does not correspond to the full contrast of black
and white.
+@example
+$ astquery ned --dataset=objdir --overlapwith=build/sb.fits \
+ --column=ra,dec,z --output=ned.fits
+$ astmatch ned.fits -h1 --coord=53.1616278,-27.7802446 \
+ --ccol1=RA,Dec --aperture=1/3600
+$ redshift=$(asttable ned_matched.fits -cz)
+$ echo $redshift
+@end example
-@item -B
-@itemx --forcemax
-Similar to @option{--forcemin}, but for the maximum.
+Now that we know the redshift of the central object, we can define the
coordinates of the thick line that will show the length of 20 kpc at that
redshift.
+It will be a horizontal line (fixed Declination) across a range of RA.
+The start of this thick line will be located near the top edge of the image
(at 95 percent of the image's width and height).
+With the commands below we will find the three necessary parameters (one
declination and two RAs).
+Just note that in astronomical images, RA increases to the left/east, which is
the reason we are using the minimum and @code{+} to find the RA starting point.
-@item -i
-@itemx --invert
-For 8-bit output types (JPEG, EPS, and PDF for example) the final value that
is stored is inverted so white becomes black and vice versa.
-The reason for this is that astronomical images usually have a very large area
of blank sky in them.
-The result will be that a large are of the image will be black.
-Note that this behavior is ideal for gray-scale images, if you want a color
image, the colors are going to be mixed up.
-@end table
+@example
+$ scalelineinkpc=20
+$ coverage=$(astfits build/sb.fits --skycoverage --quiet | awk 'NR==2')
+$ scalelinedec=$(echo $coverage | awk '@{print $4-($4-$3)*0.05@}')
+$ scalelinerastart=$(echo $coverage | awk '@{print $1+($2-$1)*0.05@}')
+$ scalelineraend=$(astcosmiccal --redshift=$redshift --arcsectandist \
+ | awk '@{start='$scalelinerastart'; \
+ width='$scalelineinkpc'/$1/3600; \
+ print start+width@}')
+@end example
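+As a simple (optional) sanity check, you can print the three computed values;
they should all fall within the sky coverage found above:
+@example
+$ echo "Dec: $scalelinedec, RA: $scalelinerastart to $scalelineraend"
+@end example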
-@node Drawing with vector graphics, , Pixel visualization, Invoking
astconvertt
-@subsubsection Drawing with vector graphics
-
-With the options described in this section, you can draw marks over your
to-be-published images (for example, in PDF).
-Each mark can be highly customized so they can have different shapes, colors,
line widths, text, text size, etc.
-The properties of the marks should be stored in a table that is given to the
@option{--marks} option described below.
-A fully working demo on adding marks is provided in @ref{Marking objects for
publication}.
-
-@cindex PostScript point
-@cindex Vector graphics point
-@cindex Point (Vector graphics; PostScript)
-An important factor to consider when drawing vector graphics is that vector
graphics standards (the PostScript standard in this case) use a ``point'' as
the primary unit of line thickness or font size.
-Such that 72 points correspond to 1 inch (or 2.54 centimeters).
-In other words, there are roughly 3 PostScript points in every millimeter.
-On the other hand, the pixels of the images you plan to show as the background
do not have any real size!
-Pixels are abstract and can be associated with any print-size.
-
-In ConvertType, the print-size of your final image is set with the
@option{--widthincm} option (see @ref{ConvertType input and output}).
-The value to @option{--widthincm} is the to-be width of the image in
centimeters.
-It therefore defines the thickness of lines or font sizes for your vector
graphics features (like the image border or marks).
-Just recall that we are not talking about resolution!
-Vector graphics have infinite resolution!
-We are talking about the relative thickness of the lines (or font sizes) in
relation to the pixels in your background image.
-
-@table @option
-@item -b INT
-@itemx --borderwidth=INT
-@cindex Border on an image
-The width of the border to be put around the EPS and PDF outputs in units of
PostScript points.
-If you are planning on adding a border, its thickness in relation to your
image pixel sizes is highly correlated with the value you give to the
@option{--widthincm} parameter.
-See the description at the start of this section for more.
-
-Unfortunately in the document structuring convention of the PostScript
language, the ``bounding box'' has to be in units of PostScript points with no
fractions allowed.
-So the border values only have to be specified in integers.
-To have a final border that is thinner than one PostScript point in your
document, you can ask for a larger width in ConvertType and then scale down the
output EPS or PDF file in your document preparation program.
-For example, by setting @command{width} in your @command{includegraphics}
command in @TeX{} or @LaTeX{} to be larger than the value to
@option{--widthincm}.
-Since it is vector graphics, the changes of size have no effect on the quality
of your output (pixels do not get different values).
-
-@item --bordercolor=STR
-The name of the color to use for border that will be put around the EPS and
PDF outputs.
-The list of available colors, along with their name and an example can be seen
with the following command (also see @ref{Vector graphics colors}):
+To draw coordinates over the image, we need to feed these values into PGFPlots.
+But manually entering numbers into the PGFPlots source will be very
frustrating and prone to many errors!
+Fortunately there is an easy way to do this: @LaTeX{} macros.
+New macros are defined by this @LaTeX{} command:
+@example
+\newcommand@{\macroname@}@{value@}
+@end example
+@noindent
+Anywhere that @LaTeX{} confronts @code{\macroname}, it will replace it with
@code{value} when building the output.
+We will have one file called @file{macros.tex} in the build directory and
define macros based on those values.
+We will use the shell's @code{printf} command to write these macro definition
lines into the macro file.
+We just have to use double backslashes in the @code{printf} command, because
backslash is a special character for @code{printf} and we need one literal
backslash in the output.
+Also, we put a @code{\n} at the end of each line; otherwise, all the commands
will go into a single line of the macro file.
+We will also place the random `@code{ma}' string at the start of all our
@LaTeX{} macros to help identify the macros for this plot.
@example
-$ astconvertt --listcolors
+$ macros=build/macros.tex
+$ printf '\\newcommand@{\\maScaleDec@}'"@{$scalelinedec@}\n" > $macros
+$ printf '\\newcommand@{\\maScaleRAa@}'"@{$scalelinerastart@}\n" >> $macros
+$ printf '\\newcommand@{\\maScaleRAb@}'"@{$scalelineraend@}\n" >> $macros
+$ printf '\\newcommand@{\\maScaleKpc@}'"@{$scalelineinkpc@}\n" >> $macros
+$ printf '\\newcommand@{\\maCenterZ@}'"@{$redshift@}\n" >> $macros
@end example
-This option only accepts the name of the color, not the numeric identifier.
+Please open the macros file after these commands and have a look to see that
it conforms to the expected format above.
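+For example, you can quickly inspect it from the command-line:
+@example
+$ cat build/macros.tex
+@end example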
+Another set of macros we will need to feed into PGFPlots is the coordinates of
the image corners.
+Fortunately the @code{coverage} variable found above is also useful here.
+We just need to extract each item before feeding it into the macros.
+To do this, we will use AWK and keep each value in the temporary shell
variable `@code{v}'.
-@item --marks=STR
-Draw vector graphics (infinite resolution) marks over the image.
-The value to this option should be the file name of a table containing the
mark information.
-The table given to this option can have various properties for each mark in
each column.
-You can specify which column contains which property of the marks using the
options below that start with @option{--mark}.
-Only two property columns are mandatory (@option{--markcoords}), the rest are
optional.
+@example
+$ v=$(echo $coverage | awk '@{print $1@}')
+$ printf '\\newcommand@{\\maCropRAMin@}'"@{$v@}\n" >> $macros
+$ v=$(echo $coverage | awk '@{print $2@}')
+$ printf '\\newcommand@{\\maCropRAMax@}'"@{$v@}\n" >> $macros
+$ v=$(echo $coverage | awk '@{print $3@}')
+$ printf '\\newcommand@{\\maCropDecMin@}'"@{$v@}\n" >> $macros
+$ v=$(echo $coverage | awk '@{print $4@}')
+$ printf '\\newcommand@{\\maCropDecMax@}'"@{$v@}\n" >> $macros
+@end example
-The table can be in any of the Gnuastro's @ref{Recognized table formats}.
-For more on the difference between vector and raster graphics, see @ref{Raster
and Vector graphics}.
-For example, if your table with mark information is called
@file{my-marks.fits}, you can use the command below to draw red circles of
radius 5 pixels over the coordinates.
+Finally, we also need to pass some other numbers to PGFPlots: 1) the major
tick distance (in the coordinate axes that will be printed on the edge of the
image).
+We will assume 7 ticks for this image.
+2) The minimum and maximum surface brightness values that we gave to
ConvertType when making the PDF; PGFPlots will define its color-bar based on
these two values.
@example
-$ astconvertt image.fits --output=image.pdf \
- --marks=marks.fits --mode=wcs \
- --markcoords=RA,DEC
+$ v=$(echo $coverage | awk '@{print ($2-$1)/7@}')
+$ printf '\\newcommand@{\\maTickDist@}'"@{$v@}\n" >> $macros
+$ printf '\\newcommand@{\\maSBlow@}'"@{$sblow@}\n" >> $macros
+$ printf '\\newcommand@{\\maSBhigh@}'"@{$sbhigh@}\n" >> $macros
@end example
-You can highly customize each mark with different columns in @file{marks.fits}
using the @option{--mark*} options below (for example, using different colors,
different shapes, different sizes, text, and the rest on each mark).
-
-@item --markshdu=STR/INT
-The HDU (or extension) name or number of the table containing mark properties
(file given to @option{--marks}).
-This is only relevant if the table is in the FITS format and there is more
than one HDU in the FITS file.
+All the necessary numbers are now ready.
+Please copy the contents below into a file called @file{my-figure.tex}.
+This is the PGFPlots source for this particular plot.
+Besides the coordinates and scale-line, we will also add some text over the
image and an orange arrow pointing to the central object with its redshift
printed over it.
+The parameters are generally human-readable, so you should be able to get a
good feeling for every line.
+There are also comments which will show up as a different color when you copy
this into a plain-text editor.
-@item -r STR,STR
-@itemx --markcoords=STR,STR
-The column names (or numbers) containing the coordinates of each mark (in
table given to @option{--marks}).
-Only two values should be given to this option (one for each coordinate).
-They can either be given to one call (@option{--markcoords=RA,DEC}) or in
separate calls (@option{--markcoords=RA --markcoords=DEC}).
+@verbatim
+\begin{tikzpicture}
-When @option{--mode=image} the columns will be associated to the
horizontal/vertical coordinates of the image, and interpreted in units of
pixels.
-In @option{--mode=wcs}, the columns will be associated to the WCS coordinates
(typically Right Ascension and Declination, in units of degrees).
+ %% Define the coordinates and colorbar
+ \begin{axis}[
+ at={(0,0)},
+ axis on top,
+ x dir=reverse,
+ scale only axis,
+ width=\linewidth,
+ height=\linewidth,
+ minor tick num=10,
+ xmin=\maCropRAMin,
+ xmax=\maCropRAMax,
+ ymin=\maCropDecMin,
+ ymax=\maCropDecMax,
+ enlargelimits=false,
+ every tick/.style={black},
+ xtick distance=\maTickDist,
+ ytick distance=\maTickDist,
+ yticklabel style={rotate=90},
+ ylabel={Declination (degrees)},
+ xlabel={Right Ascension (degrees)},
+ ticklabel style={font=\small,
+ /pgf/number format/.cd, precision=4,/tikz/.cd},
+ x label style={at={(axis description cs:0.5,0.02)},
+ anchor=north,font=\small},
+ y label style={at={(axis description cs:0.07,0.5)},
+ anchor=south,font=\small},
+ colorbar,
+ colormap name=gray,
+ point meta min=\maSBlow,
+ point meta max=\maSBhigh,
+ colorbar style={
+ at={(1.01,1)},
+ ylabel={Surface brightness (mag/arcsec$^2$)},
+ yticklabel style={
+ /pgf/number format/.cd, precision=1, /tikz/.cd},
+ y label style={at={(axis description cs:5.3,0.5)},
+ anchor=south,font=\small},
+ },
+ ]
-@item -O STR
-@itemx --mode=STR
-The coordinate mode for interpreting the values in the columns given to the
@option{--markcoord1} and @option{--markcoord2} options.
-The acceptable values are either @code{img} (for image or pixel coordinates),
and @code{wcs} for World Coordinate System (typically RA and Dec).
-For the WCS-mode, the input image should have the necessary WCS keywords,
otherwise ConvertType will crash.
+ %% Put the image in the proper positions of the plot.
+ \addplot graphics[ xmin=\maCropRAMin, xmax=\maCropRAMax,
+ ymin=\maCropDecMin, ymax=\maCropDecMax]
+ {sb.pdf};
-@item --markshape=STR/INT
-@cindex Shapes for marks (vector graphics)
-The column name(s), or number(s), containing the shapes of each mark (in table
given to @option{--marks}).
-The shapes can either be identified by their name, or their numerical
identifier.
-If identifying them by name in a plain-text table, you need to define a string
column (see @ref{Gnuastro text table format}).
-The full list of names is shown below, with their numerical identifier in
parenthesis afterwards.
-For each shape, you can also specify properties such as the size, line width,
rotation, and color.
-See the description of the relevant @option{--mark*} option below.
+ %% Draw the scale factor.
+ \addplot[black, line width=5, name=scaleline] coordinates
+ {(\maScaleRAa,\maScaleDec) (\maScaleRAb,\maScaleDec)}
+ node [anchor=north west] {\large $\maScaleKpc$ kpc};
+ \end{axis}
-@table @code
+ %% Add some text anywhere over the plot. The text is added two
+ %% times: the first time with a white background (that with a
+ %% certain opacity), the second time just the text with opacity.
+ \node[anchor=south west, fill=white, opacity=0.5]
+ at (0.01\linewidth,0.01\linewidth)
+ {(a) Text can be added here};
+ \node[anchor=south west]
+ at (0.01\linewidth,0.01\linewidth)
+ {(a) Text can be added here};
-@item circle (1)
-A circular circumference.
-It's @emph{radius} is defined by a single size element (the first column given
to @option{--marksize}).
-Any value in the second size column (if given for other shapes in the same
call) are ignored by this shape.
+ %% Add an arrow to highlight certain structures.
+ \draw [->, red!70!yellow, line width=5]
+ (0.35\linewidth,0.35\linewidth)
+ -- node [anchor=south, rotate=45]{$z=\maCenterZ$}
+ (0.45\linewidth,0.45\linewidth);
+\end{tikzpicture}
+@end verbatim
-@item plus (2)
-The plus sign (@mymath{+}).
-The @emph{length of its lines} is defined by a single size element (the first
column given to @option{--marksize}).
-Such that the intersection of its lines is on the central coordinate of the
mark.
-Any value in the second size column (if given for other shapes in the same
call) are ignored by this shape.
+Finally, we need another simple @LaTeX{} source for the main PDF ``report''
that will host this figure.
+This can actually be your paper or slides, for example.
+Here, a minimal working example will suffice.
-@item cross (3)
-A multiplication sign (@mymath{\times}).
-The @emph{length of its lines} is defined by a single size element (the first
column given to @option{--marksize}).
-Such that the intersection of its lines is on the central coordinate of the
mark.
-Any value in the second size column (if given for other shapes in the same
call) are ignored by this shape.
+@verbatim
+\documentclass{article}
-@item ellipse (4)
-An elliptical circumference.
-Its major axis radius is defined by the first size element (first column given
to @option{--marksize}), and its axis ratio is defined through the second size
element (second column given to @option{--marksize}).
+%% Import the TiKZ package and activate its "external" feature.
+\usepackage{tikz}
+\usetikzlibrary{external}
+\tikzexternalize
-@item point (5)
-A point (or a filled circle).
-Its @emph{radius} is defined by a single size element (the first column given
to @option{--marksize}).
-Any value in the second size column (if given for other shapes in the same
call) are ignored by this shape.
+%% PGFPlots (which uses TiKZ).
+\usepackage{pgfplots}
+\pgfplotsset{axis line style={thick}}
+\pgfplotsset{
+ /pgfplots/colormap={gray}{rgb255=(0,0,0) rgb255=(255,255,255)}
+}
-This filled circle mark is defined as a ``point'' because it is usually
relevant as a small size (or point in the whole image).
-But there is no limit on its size, so it can be arbitrarily large.
+%% Import the macros.
+\input{macros.tex}
-@item square (6)
-A square circumference.
-Its @emph{edge length} is defined by a single size element (the first column
given to @option{--marksize}).
-Any value in the second size column (if given for other shapes in the same
call) are ignored by this shape.
+%% Start document.
+\begin{document}
+You can write anything here.
-@item rectangle (7)
-A rectangular circumference.
-Its length along the horizontal image axis is defined by first size element
(first column given to @option{--marksize}), and its length along the vertical
image axis is defined through the second size element (second column given to
@option{--marksize}).
+%% Add the figure and its caption.
+\begin{figure}
+ \input{my-figure.tex}
+ \caption{A demo image.}
+\end{figure}
-@item line (8)
-A line.
-The line's @emph{length} is defined by a single size element (the first column
given to @option{--marksize}.
-The line will be centered on the given coordinate.
-Like all shapes, you can rotate the line about its center using the
@option{--markrotate} column.
-Any value in the second size column (if given for other shapes in the same
call) are ignored by this shape.
+%% Finish the document.
+\end{document}
+@end verbatim
-@end table
+You are now ready to create the PDF.
+But @LaTeX{} creates many temporary files, so to avoid populating our
top-level directory, we will copy the two @file{.tex} files into the build
directory, go there and run @LaTeX{}.
+Before running it though, we will first delete all the files that have the
name pattern @file{*-figure0*}; these are ``external'' files created by
TiKZ+PGFPlots, including the actual PDF of the figure.
-@item --markrotate=STR/INT
-Column name or number that contains the mark's rotation angle.
-The rotation angle should be in degrees and be relative to the horizontal axis
of the image.
+@example
+$ cp report.tex my-figure.tex build
+$ cd build
+$ rm -f *-figure0*
+$ pdflatex -shell-escape -halt-on-error report.tex
+@end example
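+If you later change the figure source or the macro values, a simple way to
guarantee a clean rebuild of the externalized figure is to repeat the last
two commands above:
+@example
+$ rm -f *-figure0*
+$ pdflatex -shell-escape -halt-on-error report.tex
+@end example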
-@item --marksize=STR[,STR]
-The column name(s), or number(s), containing the size(s) of each mark (in
table given to @option{--marks}).
-All shapes need at least one ``size'' parameter and some need two.
-For the interpretation of the size column(s) for each shape, see the
@option{--markshape} option's description.
-Since the size column(s) is (are) optional, when not specified, default values
will be used (which may be too small in larger images, so you need to change
them).
-
-By default, the values in the size column are assumed to be in the same units
as the coordinates (defined by the @option{--mode} option, described above).
-However, when the coordinates are in WCS-mode, some special cases may occur
for the size.
-@itemize
-@item
-The native WCS units (usually degrees) can be too large, and it may be more
convenient for the values in the size column(s) to be in arc-seconds.
-In this case, you can use the @option{--sizeinarcsec} option.
-@item
-Similar to above, but in units of arc-minutes.
-In this case, you can use the @option{--sizeinarcmin} option.
-@item
-Your sizes may be in units of pixels, not the WCS units.
-In this case, you can use the @option{--sizeinpix} option.
-@end itemize
-
-@item --sizeinpix
-In WCS-mode, assume that the sizes are in units of pixels.
-By default, when in WCS-mode, the sizes are assumed to be in the units of the
WCS coordinates (usually degrees).
-
-@item --sizeinarcsec
-In WCS-mode, assume that the sizes are in units of arc-seconds.
-By default, when in WCS-mode, the sizes are assumed to be in the units of the
WCS coordinates (usually degrees).
-
-@item --sizeinarcmin
-In WCS-mode, assume that the sizes are in units of arc-seconds.
-By default, when in WCS-mode, the sizes are assumed to be in the units of the
WCS coordinates (usually degrees).
-
-@item --marklinewidth=STR/INT
-Column containing the width (thickness) of the line to draw each mark.
-The line width is measured in units of ``points'' (where 72 points is one
inch), and it can be any positive floating point number.
-Therefore, the thickness (in relation to the pixels of your image) depends on
@option{--widthincm} option.
-For more, see the description at the start of this section.
+You now have the full ``report'' in @file{report.pdf}.
+Try adding some extra text on top of the figure, or in the caption, and
re-running the last four commands.
+Also try changing the 20 kpc scale-line length to 50 kpc, or changing the
redshift, to see how the length and text of the thick scale-line will
automatically change.
+The good news is that you also have the raw PDF of the figure that you can
use in other places.
+You can see that file in @file{report-figure0.pdf}.
-@item --markcolor=STR/INT
-Column containing the color of the mark.
-This column can be either a string or an integer.
-As a string, the color name can be written directly in your table (this
greatly helps in human readability).
-For more on string columns see @ref{Gnuastro text table format}.
-As an integer, you can simply use the numerical identifier of the column.
-You can see the list of colors with their names and numerical identifiers in
Gnuastro by running ConvertType with @option{--listcolors}, or see @ref{Vector
graphics colors}.
+In a larger paper, you can add multiple such figures (with different
@file{.tex} files that are placed in different @code{figure} environments with
different captions).
+Each figure will get a number in the build directory.
+TiKZ also allows setting a file name for each ``external'' figure (to avoid
such numbers, which can be annoying if the order of the figures changes); see
the sketch below.
+PGFPlots is also highly customizable: you can make many changes and
customizations.
+Both
TiKZ@footnote{@url{http://mirrors.ctan.org/graphics/pgf/base/doc/pgfmanual.pdf}}
and
PGFPlots@footnote{@url{http://mirrors.ctan.org/graphics/pgf/contrib/pgfplots/doc/pgfplots.pdf}}
have wonderful manuals, so have a look through them.
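+For example (a sketch using TiKZ's @code{external} library; @code{fig-galaxy}
is a hypothetical name), you can fix the external file name of a figure by
calling this just before including its source in @file{report.tex}:
+@example
+\tikzsetnextfilename@{fig-galaxy@}
+\input@{my-figure.tex@}
+@end example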
-@item --listcolors
-The list of acceptable color names, their codes and their representation can
be seen with the @option{--listcolors} option.
-By ``representation'' we mean that the color will be shown on the terminal as
the background in that column.
-But this will only be properly visible with ``true color'' or 24-bit
terminals, see @url{https://en.wikipedia.org/wiki/ANSI_escape_code,ANSI escape
sequence standard}.
-Most modern GNU/Linux terminals support 24-bit colors natively, and no
modification is necessary.
-For macOS, see the box below.
+@menu
+* Full script of annotations on figure:: All the steps in one script
+@end menu
-The printed text in standard output is in the @ref{Gnuastro text table
format}, so if you want to store this table, you can simply pipe the output to
Gnuastro's Table program and store it as a FITS table:
+@node Full script of annotations on figure, , Annotations for figure in
paper, Annotations for figure in paper
+@subsubsection Full script of annotations on figure
-@example
-$ astconvertt --listcolors | astttable -ocolors.fits
-@end example
+In @ref{Annotations for figure in paper}, each one of the steps to add
annotations over an image was described in detail.
+So if you have already understood the steps, but just need to add annotations
over a new image, repeating those steps individually will be annoying.
+Therefore in this section, we will summarize all the steps in a single script
that you can simply copy-paste into a text editor, configure, and run.
-@cindex iTerm
-@cindex macOS terminal 24-bit color
-@cindex Color in macOS terminals
@cartouche
@noindent
-@strong{macOS terminal colors}: as of August 2022, the default macOS terminal
(iTerm) does not support 24-bit colors!
-The output of @option{--listlines} therefore does not display the actual
colors (you can only use the color names).
-One tested solution is to install and use @url{https://iterm2.com, iTerm2},
which is free software and available in
@url{https://formulae.brew.sh/cask/iterm2, Homebrew}.
-iTerm2 is described as a successor for iTerm and works on macOS 10.14
(released in September 2018) or newer.
+@strong{Necessary files:} To run this script, you will need an image to crop
your object from (here assuming it is called @file{ah_f160w.fits} with a
certain zero point) and the two files @file{my-figure.tex} and
@file{report.tex} that were fully included in @ref{Annotations for figure in
paper}.
+Also, we have brought the redshift as a parameter here.
+But if the center of your image always points to your main object, you can
also include the Query command to automatically find the object's redshift
from NED.
+Alternatively, your image may already be cropped; in this case, you can
remove the cropping step and give your image directly to the warping step.
@end cartouche
-@item --marktext=STR/INT
-Column name or number that contains the text that should be printed under the
mark.
-If the column is numeric, the number will be printed under the mark (for
example, if you want to write the magnitude or redshift of the object under the
mark showing it).
-For the precision of writing floating point columns, see
@option{--marktextprecision}.
-But if the column has a string format (for example, the name of the object
like an NGC1234), you need to define the column as a string column (see
@ref{Gnuastro text table format}).
+@verbatim
+# Parameters.
+sblow=22 # Minimum surface brightness.
+sbhigh=30 # Maximum surface brightness.
+bdir=build # Build directory location on filesystem.
+numticks=7 # Number of major ticks in each axis.
+redshift=0.619 # Redshift of object of interest.
+zeropoint=25.94 # Zero point of input image.
+scalelineinkpc=20 # Length of scale-line (in kilo parsecs).
+input=ah_f160w.fits # Name of input (to crop).
-For text with different lengths, set the length in the definition of the
column to the maximum length of the strings to be printed.
-If there are some rows or marks that don't require text, set the string in
this column to @option{n/a} (not applicable; the blank value for strings in
Gnuastro).
-When having strings with different lengths, make sure to have enough white
spaces (for the shorter strings) so the adjacent columns are not taken as part
of the string (see @ref{Gnuastro text table format}).
+# Stop the script in case of a crash.
+set -e
-@item --marktextprecision=INT
-The number of decimal digits to print after the floating point.
-This is only relevant when @option{--marktext} is given, and the selected
column has a floating point format.
+# Build directory
+if ! [ -d $bdir ]; then mkdir $bdir; fi
-@item --markfont=STR/INT
-@cindex Fonts
-@cindex Ghostscript fonts
-Column name or number that contains the font for the displayed text under the
mark.
-This is only relevant if @option{--marktext} is called.
-The font should be accessible by Ghostscript.
+# Crop out the desired region.
+crop=$bdir/crop.fits
+astcrop $input --center=53.1616278,-27.7802446 --mode=wcs \
+ --width=20/3600 --output=$crop
-If you are not familiar with the available fonts on your system's Ghostscript,
you can use the @option{--showfonts} option to see all the fonts in a custom
PDF file (one page per font).
-If you are already familiar with the font you want, but just want to make sure
about its presence (or spelling!), you can get a list (on standard output) of
all the available fonts with the @option{--listfonts} option.
-Both are described below.
+# Warp the image to larger pixels to show surface brightness better.
+scaled=$bdir/scaled.fits
+astwarp $crop --centeroncorner --scale=1/3 --output=$scaled
-@cindex Adding Ghostscript fonts
-It is possible to add custom fonts to Ghostscript as described in the
@url{https://ghostscript.com/doc/current/Fonts.htm, Fonts section} of the
Ghostscript manual.
+# Calculate the pixel area and convert image to Surface brightness.
+sb=$bdir/sb.fits
+pixarea=$(astfits $scaled --pixelareaarcsec2)
+astarithmetic $scaled $zeropoint $pixarea counts-to-sb \
+ --output=$sb
-@item --markfontsize=STR/INT
-Column name or number that contains the font size to use.
-This is only relevant if a text column has been defined (with
@option{--marktext}, described above).
-The font size is in units of ``point''s, see description at the start of this
section for more.
+# Convert the surface brightness image into PDF.
+sbpdf=$bdir/sb.pdf
+astconvertt $sb --colormap=gray --borderwidth=0 \
+ --fluxhigh=$sbhigh --fluxlow=$sblow --output=$sbpdf
-@item --showfonts
-Create a special PDF file that shows the name and shape of all available fonts
in your system's Ghostscript.
-You can use this for selecting the best font to put in the
@option{--markfonts} column.
-The available fonts can differ from one system to another (depending on how
Ghostscript was configured in that system).
-The PDF file's name is constructed by appending a @file{-fonts.pdf} to the
file name given to the @option{--output} option.
+# Specify the coordinates of the scale line (specifying a certain
+# width in kpc). We will put it on the top-right side of the image (5%
+# of the full width of the image away from the edge).
+coverage=$(astfits $sb --skycoverage --quiet | awk 'NR==2')
+scalelinedec=$(echo $coverage | awk '{print $4-($4-$3)*0.05}')
+scalelinerastart=$(echo $coverage | awk '{print $1+($2-$1)*0.05}')
+scalelineraend=$(astcosmiccal --redshift=$redshift --arcsectandist \
+ | awk '{start='$scalelinerastart'; \
+ width='$scalelineinkpc'/$1/3600; \
+ print start+width}')
-The PDF file will have one page for each font, and the sizes of the pages are
customized for showing the fonts (each page is horizontally elongated).
-This helps to better check the files by disable ``continuous'' mode in your
PDF viewer, and setting the zoom such that the width of the page corresponds to
the width of your PDF viewer.
-Simply pressing the left/right keys will then nicely show each fonts
separately.
+# Write the LaTeX macros to use in plot. Start with the thick line
+# showing tangential distance.
+macros=$bdir/macros.tex
+printf '\\newcommand{\\maScaleDec}'"{$scalelinedec}\n" > $macros
+printf '\\newcommand{\\maScaleRAa}'"{$scalelinerastart}\n" >> $macros
+printf '\\newcommand{\\maScaleRAb}'"{$scalelineraend}\n" >> $macros
+printf '\\newcommand{\\maScaleKpc}'"{$scalelineinkpc}\n" >> $macros
+printf '\\newcommand{\\maCenterZ}'"{$redshift}\n" >> $macros
-@item --listfonts
-Print (to standard output) the names of all available fonts in Ghostscript
that you can use for the @option{--markfonts} column.
-The available fonts can differ from one system to another (depending on how
Ghostscript was configured in that system).
-If you are not already familiar with the shape of each font, please use
@option{--showfonts} (described above).
+# Add image extrema for the coordinates.
+v=$(echo $coverage | awk '{print $1}')
+printf '\\newcommand{\\maCropRAMin}'"{$v}\n" >> $macros
+v=$(echo $coverage | awk '{print $2}')
+printf '\\newcommand{\\maCropRAMax}'"{$v}\n" >> $macros
+v=$(echo $coverage | awk '{print $3}')
+printf '\\newcommand{\\maCropDecMin}'"{$v}\n" >> $macros
+v=$(echo $coverage | awk '{print $4}')
+printf '\\newcommand{\\maCropDecMax}'"{$v}\n" >> $macros
-@end table
+# Distance between each tick value.
+v=$(echo $coverage | awk '{print ($2-$1)/'$numticks'}')
+printf '\\newcommand{\\maTickDist}'"{$v}\n" >> $macros
+printf '\\newcommand{\\maSBlow}'"{$sblow}\n" >> $macros
+printf '\\newcommand{\\maSBhigh}'"{$sbhigh}\n" >> $macros
+
+# Copy the LaTeX source into the build directory and go there to run
+# it and have all the temporary LaTeX files there.
+cp report.tex my-figure.tex $bdir
+cd $bdir
+rm -f *-figure0*
+pdflatex -shell-escape -halt-on-error report.tex
+@end verbatim
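+For example, assuming you saved the script above as @file{annotate.sh} (a
hypothetical file name), in a directory that also contains
@file{ah_f160w.fits}, @file{my-figure.tex} and @file{report.tex}, you can run
it with:
+@example
+$ bash annotate.sh
+@end example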
+@node Invoking astconvertt, , Annotations for figure in paper, ConvertType
+@subsection Invoking ConvertType
+ConvertType will convert any recognized input file type to any specified
output type.
+The executable name is @file{astconvertt} with the following general template
+@example
+$ astconvertt [OPTION...] InputFile [InputFile2] ... [InputFile4]
+@end example
+@noindent
+One line examples:
+@example
+## Convert an image in FITS to PDF:
+$ astconvertt image.fits --output=pdf
+## Similar to before, but use the Viridis color map:
+$ astconvertt image.fits --colormap=viridis --output=pdf
+## Add markers to highlight parts of the image
+## ('marks.fits' is a table containing coordinates)
+$ astconvertt image.fits --marks=marks.fits --output=pdf
-@node Table, Query, ConvertType, Data containers
-@section Table
+## Convert an image in JPEG to FITS (with multiple extensions
+## if it has color):
+$ astconvertt image.jpg -oimage.fits
-Tables are the high-level products of processing on low-leveler data like
images or spectra.
-For example, in Gnuastro, MakeCatalog will process the pixels over an object
and produce a catalog (or table) with the properties of each object such as
magnitudes and positions (see @ref{MakeCatalog}).
-Each one of these properties is a column in its output catalog (or table) and
for each input object, we have a row.
+## Use three 2D arrays to create an RGB JPEG output (two are
+## plain-text, the third is FITS, but all have the same size).
+$ astconvertt f1.txt f2.txt f3.fits -o.jpg
-When there are only a small number of objects (rows) and not too many
properties (columns), then a simple plain text file is mainly enough to store,
transfer, or even use the produced data.
-However, to be more efficient, astronomers have defined the FITS binary table
standard to store data in a binary format (which cannot be seen in a text
editor text).
-This can offer major advantages: the file size will be greatly reduced and the
reading and writing will also be faster (because the RAM and CPU also work in
binary).
-The acceptable table formats are fully described in @ref{Tables}.
+## Use two images and one blank for an RGB EPS output:
+$ astconvertt M31_r.fits M31_g.fits blank -oeps
-@cindex AWK
-@cindex GNU AWK
-Binary tables are not easily readable with basic plain-text editors.
-There is no fixed/unified standard on how the zero and ones should be
interpreted.
-Unix-like operating systems have flourished because of a simple fact:
communication between the various tools is based on human readable
characters@footnote{In ``The art of Unix programming'', Eric Raymond makes this
suggestion to programmers: ``When you feel the urge to design a complex binary
file format, or a complex binary application protocol, it is generally wise to
lie down until the feeling passes.''.
-This is a great book and strongly recommended, give it a look if you want to
truly enjoy your work/life in this environment.}.
-So while the FITS table standards are very beneficial for the tools that
recognize them, they are hard to use in the vast majority of available software.
-This creates limitations for their generic use.
+## Directly pass input from output of another program through Standard
+## input (not a file).
+$ cat 2darray.txt | astconvertt -oimg.fits
+@end example
-Table is Gnuastro's solution to this problem.
-Table has a large set of operations that you can directly do on any recognized
table (such as selecting certain rows and doing arithmetic on the columns).
-For operations that Table does not do internally, FITS tables (ASCII or
binary) are directly accessible to the users of Unix-like operating systems (in
particular those working the command-line or shell, see @ref{Command-line
interface}).
-With Table, a FITS table (in binary or ASCII formats) is only one command away
from AWK (or any other tool you want to use).
-Just like a plain text file that you read with the @command{cat} command.
-You can pipe the output of Table into any other tool for higher-level
processing, see the examples in @ref{Invoking asttable} for some simple
examples.
+In the sub-sections below, the various options that are specific to
ConvertType are grouped in different categories.
+Please see those sections for a detailed discussion on each group and its
options.
+Besides those, ConvertType also shares the @ref{Common options} with other
Gnuastro programs.
+The common options are not repeated here.
-In the sections below we describe how to effectively use the Table program.
-We start with @ref{Column arithmetic}, where the basic concept and methods of
applying arithmetic operations on one or more columns are discussed.
-Afterwards, in @ref{Operation precedence in Table}, we review the various
types of operations available and their precedence in an instance of calling
Table.
-This is a good place to get a general feeling of all the things you can do
with Table.
-Finally, in @ref{Invoking asttable}, we give some examples and describe each
option in Table.
@menu
-* Printing floating point numbers:: Optimal storage of floating point types.
-* Vector columns:: How to keep more than one value in each column.
-* Column arithmetic:: How to do operations on table columns.
-* Operation precedence in Table:: Order of running options in Table.
-* Invoking asttable:: Options and arguments to Table.
+* ConvertType input and output:: Input/output file names and formats.
+* Pixel visualization:: Visualizing the pixels in the output.
+* Drawing with vector graphics:: Adding marks in many shapes and colors over
the pixels.
@end menu
-@node Printing floating point numbers, Vector columns, Table, Table
-@subsection Printing floating point numbers
-
-@cindex Floating point numbers
-@cindex Printing floating point numbers
-Many of the columns containing astronomical data will contain floating point
numbers (those that aren't an integer, like @mymath{1.23} or
@mymath{4.56\times10^{-7}}).
-However, printing (for human readability) of floating point numbers has some
intricacies that we will explain in this section.
-For a basic introduction to different types of integers or floating points,
see @ref{Numeric data types}.
-
-It may be tempting to simply use 64-bit floating points all the time and avoid
this section altogether.
-But bear in mind that compared to the 32-bit floating point type, a 64-bit
floating point type will consume double the storage, double the RAM and will
take almost double the time for processing.
-So when the statistical precision of your numbers is less than that offered by
32-bit floating point precision, it is much better to store them in the 32-bit
format.
+@node ConvertType input and output, Pixel visualization, Invoking astconvertt,
Invoking astconvertt
+@subsubsection ConvertType input and output
-Within almost all commonly used CPUs of today, numbers (including integers or
floating points) are stored in binary base-2 format (where the only digits that
can be used to represent the number are 0 and 1).
-However, we (humans) are used to numbers in base-10 (where we have 10 digits:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9).
-For integers, there is a one-to-one correspondence between a base-2 and
base-10 representation.
-Therefore, converting a base-10 integer (that you will be giving as an option
value when running a Gnuastro program, for example) to base-2 (that the
computer will store in memory), or vice-versa, will not cause any loss of
information for integers.
+@cindex Standard input
+At most four input files (one for each color channel for formats that allow
it) are allowed in ConvertType.
+The first input dataset can either be a file, or come from Standard input (see
@ref{Standard input} and @ref{Recognized file formats}).
-The problem is that floating point numbers don't have such a one-to-one
correspondence between the two notations.
-The full discussion on how floating point numbers are stored in binary format
is beyond the scope of this book.
-But please have a look at the corresponding
@url{https://en.wikipedia.org/wiki/Floating-point_arithmetic, Wikipedia
article} to get a rough feeling about the complexity.
-Of course, if you are interested in the details, that Wikipedia article should
be a good starting point for further reading.
+The order of multiple input files is important.
+After reading the input file(s), the number of color channels in all the
inputs will be used to define the output color space and how each channel is
interpreted: 1 (grayscale), 3 (RGB) or 4 (CMYK) input channels.
+For more on pixel color channels, see @ref{Pixel colors}.
+Depending on the format of the input(s), the number of input files can differ.
-@cindex IEEE 754 (floating point)
-The most common convention for storing floating point numbers in digital
storage is IEEE Standard for Floating-Point Arithmetic;
@url{https://en.wikipedia.org/wiki/IEEE_754, IEEE 754}.
-In short, the full width (in bits) assigned to that type (for example the 32
bits allocated for 32-bit floating point types) is divided into separate
components: The first bit is the ``sign'' (specifying if the number is negative
or positive).
-In 32-bit floats, the next 8 bits are the ``exponent'' and finally (again, in
32-bit floats), the ``fraction'' is stored in the next 23 bits.
-For example see
@url{https://commons.wikimedia.org/wiki/File:Float_example.svg, this image on
Wikipedia}.
+For example, if you plan to build an RGB PDF and your three channels are in
the first HDU of @file{r.fits}, @file{g.fits} and @file{b.fits}, then you can
simply call ConvertType like this:
-@cindex Decimal digits
-@cindex Precision of floats
-In IEEE 754, around zero, the base-2 and base-10 representations approximately
match.
-However, as we move away from 0, precision is gradually lost.
-The important concept in understanding the precision of floating point numbers
is ``decimal digits'', or the number of digits in the number, independent of
where the decimal point is.
-For example @mymath{1.23} has three decimal digits and
@mymath{4.5678\times10^9} has 5 decimal digits.
-According to IEEE
754@footnote{@url{https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats}},
32-bit and 64-bit floating point numbers can accurately (statistically)
represent a floating point with 7.22 and 15.95 decimal digits respectively.
+@example
+$ astconvertt r.fits g.fits b.fits -g1 --output=rgb.pdf
+@end example
-@cartouche
@noindent
-@strong{Should I store my columns as 32-bit or 64-bit floating point type?} If
your floating point numbers have 7 decimal digits or less (for example noisy
image pixel values, measured star or galaxy magnitudes, and anything that is
derived from them like galaxy mass and so on), you can safely use 32-bit
precision (the statistical error on the measurements is usually significantly
larger than 7 digits!).
-However, some columns require more digits and thus 64-bit precision.
-For example, RA or Dec with (sub-)arcsecond accuracy: the degrees can have 3
digits, and 1 arcsecond is @mymath{1/3600\sim0.0003} of a degree, requiring 4
more digits.
-You can use the @ref{Numerical type conversion operators} of @ref{Column
arithmetic} to convert your columns to a certain type for storage.
-@end cartouche
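-For example, the sketch below (with a hypothetical @file{catalog.fits}
containing a @code{MAG} column) uses the @code{float32} conversion operator to
store the magnitudes in 32-bit precision:
-@example
-$ asttable catalog.fits -c'arith MAG float32' \
-           --colmetadata=1,MAG,mag,"Magnitude (32-bit)." \
-           --output=cat-32bit.fits
-@end example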
+However, if the three color channels are in three extensions (assuming the
HDUs are respectively named @code{R}, @code{G} and @code{B}) of a single file
(for example, @file{channels.fits}), you should run it like this:
-The discussion above was for the storage of floating point numbers.
-When printing floating point numbers in a human-friendly format (for example,
in a plain-text file or on standard output in the command-line), the computer
has to convert its internal base-2 representation to a base-10 representation.
-This second conversion may cause a small discrepancy between the stored and
printed values.
+@example
+$ astconvertt channels.fits -hR -hG -hB --output=rgb.pdf
+@end example
-@cartouche
@noindent
-@strong{Use FITS tables as output of measurement programs:} When you are doing
a measurement to produce a catalog (for example with @ref{MakeCatalog}) set the
output to be a FITS table (for example @option{--output=mycatalog.fits}).
-A FITS binary table will store the same base-2 number that was measured by
the CPU.
-However, if you choose to store the output table as a plain-text table, you
risk losing information due to the human-friendly base-10 floating point
conversion (which is necessary in a plain-text output).
-@end cartouche
+On the other hand, if the channels are already in a multi-channel format (like
JPEG), you can simply provide that file:
-To customize how columns containing floating point values are printed (in a
plain-text output file, or in the standard output in your terminal), Table has
four options for the two different types: @option{--txtf32format},
@option{--txtf32precision}, @option{--txtf64format} and
@option{--txtf64precision}.
-They are fully described in @ref{Invoking asttable}.
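-For example, the sketch below (with a hypothetical @file{table.fits}) prints
the 32-bit floating point columns in fixed-point notation with 6 digits after
the decimal point:
-@example
-$ asttable table.fits --txtf32format=fixed --txtf32precision=6
-@end example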
+@example
+$ astconvertt image.jpg --output=rgb.pdf
+@end example
-@cartouche
@noindent
-@strong{Summary:} it is therefore recommended to always store your tables as
FITS (binary) tables.
-To view the contents of the table on the command-line or to feed it to a
program that doesn't recognize FITS tables, you can use the four options above
for a custom base-10 conversion that will not cause any loss of data.
-@end cartouche
-
-@node Vector columns, Column arithmetic, Printing floating point numbers, Table
-@subsection Vector columns
+If multiple channels are given as input, and the output format does not
support multiple color channels (for example, FITS), ConvertType will put the
channels in different HDUs, like the example below.
+After running the @command{astfits} command, if your JPEG file was not
grayscale (single channel), you will see multiple HDUs in @file{channels.fits}.
-@cindex Vector columns
-@cindex Columns (Vector)
-@cindex Multi-value columns (vector)
-In its most common format, each column of a table only has a single value in
each row.
-For example, we usually have one column for the magnitude, another column for
the RA (Right Ascension) and yet another column for the DEC (Declination) of a
set of galaxies/stars (where each galaxy is represented by one row in the
table).
-This common single-valued column format is sufficient in many scenarios.
-However, in some situations (like those below) it would help to have multiple
values for each row in each column, not just one.
+@example
+$ astconvertt image.jpg --output=channels.fits
+$ astfits channels.fits
+@end example
-@itemize
-@item
-@cindex MUSE
-@cindex Spectrum
-@cindex Radial profile
-Conceptually: the various numbers are ``connected'' to each other.
-In other words, their order and position in relation to each other matters.
-Common examples in astronomy are the radial profiles of each galaxy in your
catalog, or their spectrum.
-For example, each
MUSE@footnote{@url{https://www.eso.org/sci/facilities/develop/instruments/muse.html}}
spectrum has 3681 points (with a sampling of 1.25 Angstroms).
+As shown above, the output's file format will be interpreted from the name
given to the @option{--output} option (as a common option to all Gnuastro
programs, for the description of @option{--output}, see @ref{Input output
options}).
+It can either be given on the command-line or in any of the configuration
files (see @ref{Configuration files}).
+When the output suffix is not recognized, it will default to plain text
format, see @ref{Recognized file formats}.
-Dealing with this many separate measurements as separate columns in your table
is very inconvenient and error-prone: you may forget to move some of them into
an output table for further analysis, mistakenly change their order, or apply
an operation to only a sub-set of them.
+If there is one input dataset (color channel) the output will be gray-scale.
+When three input datasets (color channels) are given, they are respectively
considered to be the red, green and blue color channels.
+Finally, if there are four color channels they will be cyan, magenta, yellow
and black (CMYK colors).
-@item
-Technically: in the FITS standard, you can only store a maximum of 999 columns
in a FITS table.
-Therefore, if you have more than 999 data points for each galaxy (like the
MUSE spectra example above), it is impossible to store each point in one table
as separate columns.
-@end itemize
+The value to @option{--output} (or @option{-o}) can be either a full file name
or just the suffix of the desired output format.
+In the former case (full name), it will be directly used for the output's file
name.
+In the latter case, the name of the output file will be set based on the
automatic output guidelines, see @ref{Automatic output}.
+Note that the suffix name can optionally start with a @file{.} (dot), so for
example, @option{--output=.jpg} and @option{--output=jpg} are equivalent.
+See @ref{Recognized file formats}.
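+For example, both of the commands below (a sketch, with a hypothetical
@file{image.fits}) produce the same @file{image.jpg} through the automatic
output guidelines:
+@example
+$ astconvertt image.fits --output=jpg
+$ astconvertt image.fits --output=.jpg
+@end example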
-To address these problems, the FITS standard has defined the concept of
``vector'' columns in its Binary table format (ASCII FITS tables don't support
vector columns, but Gnuastro's plain-text format does, as described here).
-Within each row of a single vector column, we can store any number of data
points (like the MUSE spectra above or the full radial profile of each galaxy).
-All the values in a vector column have to have the same numeric data type (see
@ref{Numeric data types}), and the number of elements within each vector column
is the same for all rows.
+The relevant options for input/output formats are described below:
-By grouping conceptually similar data points (like a spectrum) in one vector
column, we can significantly reduce the number of columns and make it much more
manageable, without losing any information!
-To demonstrate the vector column features of Gnuastro's Table program, let's
start with a randomly generated small (5 rows and 3 columns) catalog.
-This will allow us to show the outputs of each step here, but you can apply
the same concept to vectors with any number of columns.
+@table @option
+@item -h STR/INT
+@itemx --hdu=STR/INT
+Input HDU name or counter (counting from 0) for each input FITS file.
+If the same HDU should be used from all the FITS files, you can use the
@option{--globalhdu} option described below.
+In ConvertType, it is possible to call the HDU option multiple times for the
different input FITS or TIFF files in the same order that they are called on
the command-line.
+Note that in the TIFF standard, one `directory' (similar to a FITS HDU) may
contain multiple color channels (for example, when the image is in RGB).
-With the command below, we use @code{seq} to generate a single-column table
that is piped to Gnuastro's Table program.
-Table then uses column arithmetic to generate three columns with random values
from that column (for more, see @ref{Column arithmetic}).
-Each column becomes noisy, with standard deviations of 2, 5 and 10.
-Finally, we will add metadata to each column, giving each a different name
(using names is always the best way to work with columns):
+Except for the fact that multiple calls are possible, this option is identical
to the common @option{--hdu} in @ref{Input output options}.
+The number of calls to this option cannot be less than the number of input
FITS or TIFF files; if there are more, the extra HDUs will be ignored.
+Note that they will be read in the order described in @ref{Configuration file
precedence}.
-@example
-$ seq 1 5 \
- | asttable -c'arith $1 2 mknoise-sigma f32' \
- -c'arith $1 5 mknoise-sigma f32' \
- -c'arith $1 10 mknoise-sigma f32' \
- --colmetadata=1,abc,none,"First column." \
- --colmetadata=2,def,none,"Second column." \
- --colmetadata=3,ghi,none,"Third column." \
- --output=table.fits
-@end example
+Unlike CFITSIO, libtiff (which is used to read TIFF files) only recognizes
numbers (counting from zero, similar to CFITSIO) for `directory' identification.
+Hence the concept of names is not defined for the directories and the values
to this option for TIFF files must be numbers.
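+For example, the sketch below (with a hypothetical @file{image.tif}) reads the
first TIFF directory by number:
+@example
+$ astconvertt image.tif --hdu=0 --output=image.fits
+@end example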
-With the command below, let's have a look at the table.
-When you run it, you will have a different random number generator seed, so
the numbers will be slightly different.
-For making reproducible random numbers, see @ref{Generating random numbers}.
-The @option{-Y} option is used for more easily readable numbers (without it,
floating point numbers are written in scientific notation, for more see
@ref{Printing floating point numbers}) and with the @option{-O} we are asking
Table to also print the metadata.
-For more on Table's options, see @ref{Invoking asttable} and for seeing how
the short options can be merged (such that @option{-Y -O} is identical to
@option{-YO}), see @ref{Options}.
+@item -g STR/INT
+@itemx --globalhdu=STR/INT
+Use the value given to this option (a HDU name or a counter, starting from 0)
for the HDU identifier of all the input FITS files.
+This is useful when all the inputs are distributed in different files, but
have the same HDU in those files.
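+For example, if the three color channels are all in the second HDU (counter 1)
of their respective files, the sketch below avoids calling @option{--hdu} three
times:
+@example
+$ astconvertt r.fits g.fits b.fits --globalhdu=1 --output=rgb.pdf
+@end example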
-@example
-$ asttable table.fits -YO
-# Column 1: abc [none,f32,] First column.
-# Column 2: def [none,f32,] Second column.
-# Column 3: ghi [none,f32,] Third column.
-1.074 5.535 -4.464
-0.606 -2.011 15.397
-1.475 1.811 5.687
-2.248 7.663 -7.789
-6.355 17.374 6.767
-@end example
+@item -w FLT
+@itemx --widthincm=FLT
+The width of the output in centimeters.
+This is only relevant for those formats that accept such a width as metadata
(not FITS or plain-text for example), see @ref{Recognized file formats}.
+For most digital purposes, the number of pixels is far more important than the
value to this parameter because you can adjust the absolute width (in inches or
centimeters) in your document preparation program.
-We see that indeed, it has three columns, with our given names.
-Now, let's assume that you want to make a two-element vector column from the
values in the @code{def} and @code{ghi} columns.
-To do that, you can use the @option{--tovector} option like below.
-As the name suggests, @option{--tovector} will merge the rows of the two
columns into one vector column with multiple values in each row.
+@item -x
+@itemx --hex
+@cindex ASCII85 encoding
+@cindex Hexadecimal encoding
+Use Hexadecimal encoding in creating EPS output.
+By default the ASCII85 encoding is used which provides a much better
compression ratio.
+When converted to PDF (or included in @TeX{} or @LaTeX{} which is finally
saved as a PDF file), a compact binary encoding is used which is far more
efficient than both of them.
+The choice of EPS encoding will thus have no effect on the final PDF.
-@example
-$ asttable table.fits -YO --tovector=def,ghi
-# Column 1: abc [none,f32 ,] First column.
-# Column 2: def-VECTOR [none,f32(2),] Vector by merging multiple cols.
-1.074 5.535 -4.464
-0.606 -2.011 15.397
-1.475 1.811 5.687
-2.248 7.663 -7.789
-6.355 17.374 6.767
-@end example
+So if you want to transfer your EPS files (for example, if you want to submit
your paper to arXiv or journals in PostScript), their storage volume may become
important if you have large images or lots of small ones.
+In such cases, the default ASCII85 encoding (with its roughly 40 percent
better compression ratio) is preferable to Hexadecimal encoding.
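+If Hexadecimal encoding is nevertheless required, the sketch below (with a
hypothetical @file{image.fits}) shows how to request it:
+@example
+$ astconvertt image.fits --hex --output=image.eps
+@end example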
-@cindex Tokens
-If you ignore the metadata, this doesn't seem to have changed anything!
-You see that each line of numbers still has three ``tokens'' (to distinguish
them from ``columns'').
-But once you look at the metadata, you only see metadata for two columns, not
three.
-If you look closely, the numeric data type of the newly added second column is
`@code{f32(2)}' (look above; previously it was @code{f32}).
-The @code{(2)} shows that the second column contains two numbers/tokens not
one.
-If your vector column consisted of 3681 numbers, this would be
@code{f32(3681)}.
-Looking again at the metadata, we see that @option{--tovector} has also
created a new name and comment for the new column.
-This is always done to avoid confusion with the old columns.
+@item -u INT
+@itemx --quality=INT
+@cindex JPEG compression quality
+@cindex Compression quality in JPEG
+@cindex Quality of compression in JPEG
+The quality (compression) of the output JPEG file with values from 0 to 100
(inclusive).
+For other formats the value to this option is ignored.
+Note that only in gray-scale (when one input color channel is given) will this
actually be the exact quality (each pixel will correspond to one input value).
+If it is in color mode, some degradation will occur.
+While the JPEG standard does support lossless compression, it is not commonly
implemented.
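+For example, the sketch below (with a hypothetical @file{image.fits}) writes a
JPEG with a quality of 95:
+@example
+$ astconvertt image.fits --quality=95 --output=image.jpg
+@end example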
+@end table
-Let's confirm that the newly added column is indeed a single column but with
two values.
-To do this, with the command below, we'll write the output into a FITS table.
-In the same command, let's also give a more suitable name to the new
merged/vector column.
-We can get a first confirmation by looking at the table's metadata in the
second command below:
+@node Pixel visualization, Drawing with vector graphics, ConvertType input and
output, Invoking astconvertt
+@subsubsection Pixel visualization
-@example
-$ asttable table.fits -YO --tovector=def,ghi --output=vec.fits \
- --colmetadata=2,vector,nounits,"New vector column."
+The main goal of ConvertType is to visualize pixels to/from print or web
friendly formats.
-$ asttable vec.fits -i
---------
-vec.fits (hdu: 1)
--------  -------  ----------  -------
-No.Name  Units    Type        Comment
--------  -------  ----------  -------
-1 abc    none     float32     First column.
-2 vector nounits  float32(2)  New vector column.
---------
-Number of rows: 5
---------
-@end example
+Astronomical data usually have a very large dynamic range (difference between
maximum and minimum value) and different subjects might be better demonstrated
with a limited flux range.
-@noindent
-A more robust confirmation would be to print the values in the newly added
@code{vector} column.
-As expected, asking for a single column with @option{--column} (or
@option{-c}) will give us two numbers per row/line (instead of one!).
+@table @option
+@item --colormap=STR[,FLT,...]
+The color map to visualize a single channel.
+The first value given to this option is the name of the color map, which is
shown below.
+Some color maps can be configured.
+In this case, the configuration parameters are optionally given as numbers
following the name of the color map; for example, see @option{hsv}.
+The table below contains the usable names of the color maps that are currently
supported:
-@example
-$ asttable vec.fits -c vector -YO
-# Column 1: vector [nounits,f32(2),] New vector column.
- 5.535 -4.464
--2.011 15.397
- 1.811 5.687
- 7.663 -7.789
- 17.374 6.767
-@end example
-
-If you want to keep the original single-valued columns that went into the
vector column, you can use the @code{--keepvectfin} option (read it as ``KEEP
VECtor To/From Inputs''):
-
-@example
-$ asttable table.fits -YO --tovector=def,ghi --keepvectfin \
- --colmetadata=4,vector,nounits,"New vector column."
-# Column 1: abc [none ,f32 ,] First column.
-# Column 2: def [none ,f32 ,] Second column.
-# Column 3: ghi [none ,f32 ,] Third column.
-# Column 4: vector [nounits,f32(2),] New vector column.
-1.074 5.535 -4.464 5.535 -4.464
-0.606 -2.011 15.397 -2.011 15.397
-1.475 1.811 5.687 1.811 5.687
-2.248 7.663 -7.789 7.663 -7.789
-6.355 17.374 6.767 17.374 6.767
-@end example
-
-Now that you know how to create vector columns, let's assume you have the
inverse scenario: you want to extract one of the values of a vector column into
a separate single-valued column.
-To do this, you can use the @option{--fromvector} option.
-The @option{--fromvector} option takes the name (or counter) of a vector
column, followed by any number of integer counters (counting from 1).
-It will extract those elements into separate single-valued columns.
-For example, let's assume you want to extract the second element of the
@code{vector} column in the file you made before:
+@table @option
+@item gray
+@itemx grey
+@cindex Colorspace, gray-scale
+Grayscale color map.
+This color map does not have any parameters.
+The full dataset range will be scaled to 0 and @mymath{2^8-1=255} to be stored
in the requested format.
-@example
-$ asttable vec.fits --fromvector=vector,2 -YO
-# Column 1: abc [none ,f32,] First column.
-# Column 2: vector-2 [nounits,f32,] New vector column.
-1.074 -4.464
-0.606 15.397
-1.475 5.687
-2.248 -7.789
-6.355 6.767
-@end example
+@item hsv
+@cindex Colorspace, HSV
+@cindex Hue, saturation, value
+@cindex HSV: Hue Saturation Value
+Hue, Saturation,
Value@footnote{@url{https://en.wikipedia.org/wiki/HSL_and_HSV}} color map.
+If no values are given after the name (@option{--colormap=hsv}), the dataset
will be scaled to 0 and 360 for hue covering the full spectrum of colors.
+However, you can limit the range of hue (to show only a special color range)
by explicitly requesting them after the name (for example,
@option{--colormap=hsv,20,240}).
-@noindent
-Just like the case with @option{--tovector} above, if you want to keep the
input vector column, use @option{--keepvectfin}.
-This feature is useful in scenarios where you want to select some rows based
on a single element (or multiple) of the vector column.
+The mapping of a single-channel dataset to HSV is done through the Hue and
Value elements: Lower dataset elements have lower ``value'' @emph{and} lower
``hue''.
+This creates darker colors for fainter parts, while also respecting the range
of colors.
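+For example, the sketch below (with a hypothetical @file{image.fits}) limits
the hue range to 20--240:
+@example
+$ astconvertt image.fits --colormap=hsv,20,240 --output=color.pdf
+@end example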
-@cartouche
-@noindent
-@strong{Vector columns and FITS ASCII tables:} As mentioned above, the FITS
standard only recognizes vector columns in its Binary table format (the default
FITS table format in Gnuastro).
-You can still use the @option{--tableformat=fits-ascii} option to write your
tables in the FITS ASCII format (see @ref{Input output options}).
-In this case, if a vector column is present, it will be written as separate
single-element columns to avoid losing information (as if you had called
@option{--fromvector} on all the elements of the vector column).
-A warning is printed if this occurs.
-@end cartouche
+@item viridis
+@cindex matplotlib
+@cindex Colormap: Viridis
+@cindex Viridis: Colormap
+Viridis is the default colormap of the popular Matplotlib module of Python and
available in many other visualization tools like PGFPlots.
-For an application of the vector column concepts introduced here on MUSE data,
see the 3D data cube tutorial and in particular these two sections: @ref{3D
measurements and spectra} and @ref{Extracting a single spectrum and plotting
it}.
+@item sls
+@cindex DS9
+@cindex SAO DS9
+@cindex SLS Color
+@cindex Colormap: SLS
+The SLS color range, taken from the commonly used @url{http://ds9.si.edu,SAO
DS9}.
+The advantage of this color range is that it starts with black, going into
dark blue and finishes with the brighter colors of red and white.
+So unlike the HSV color range, it includes black and white, and brighter
colors (like yellow and red) show the larger values.
-@node Column arithmetic, Operation precedence in Table, Vector columns, Table
-@subsection Column arithmetic
+@item sls-inverse
+@cindex Colormap: SLS-inverse
+The inverse of the SLS color map (see above), where the lowest value
corresponds to white and the highest value is black.
+While SLS is good for visualizing on the monitor, SLS-inverse is good for
printing.
+@end table
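+For example, the sketch below (with a hypothetical @file{image.fits}) uses the
@option{sls-inverse} color map for a print-friendly output:
+@example
+$ astconvertt image.fits --colormap=sls-inverse --output=print.pdf
+@end example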
-In many scenarios, you want to apply some kind of operation on the columns and
save them in another table or feed them into another program.
-With Table you can do a rich set of operations on the contents of one or more
columns in a table, and save the resulting values as new column(s) in the
output table.
-For seeing the precedence of Column arithmetic in relation to other Table
operators, see @ref{Operation precedence in Table}.
+@item --rgbtohsv
+When there are three input channels and the output is in the FITS format,
interpret the three input channels as red, green and blue channels (RGB) and
convert them to the hue, saturation, value (HSV) color space.
-To enable column arithmetic, the first 6 characters of the value to
@option{--column} (@code{-c}) should be the activation word `@option{arith }'
(note the space character in the end, after `@code{arith}').
-After the activation word, you can use reverse polish notation to identify the
operators and their operands, see @ref{Reverse polish notation}.
-Just note that white-space characters are used between the tokens of the
arithmetic expression and that they are meaningful to the command-line
environment.
-Therefore the whole expression (including the activation word) has to be
quoted on the command-line or in a shell script (see the examples below).
+The currently supported output formats of ConvertType do not have native
support for HSV.
+Therefore this option is only supported when the output is in FITS format and
each of the hue, saturation and value arrays can be saved as one FITS extension
in the output for further analysis (for example, to select a certain color).
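+For example, the sketch below (with hypothetical @file{r.fits}, @file{g.fits}
and @file{b.fits}) writes the hue, saturation and value arrays as extensions of
one FITS file:
+@example
+$ astconvertt r.fits g.fits b.fits -g1 --rgbtohsv --output=hsv.fits
+@end example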
-To identify a column you can directly use its name, or specify its number
(counting from one, see @ref{Selecting table columns}).
-When you are giving a column number, it is necessary to prefix the number with
a @code{$}, similar to AWK.
-Otherwise the number is not distinguishable from a constant number to use in
the arithmetic operation.
+@item -c STR
+@itemx --change=STR
+@cindex Change converted pixel values
+(@option{=STR}) Change pixel values with the following format
@option{"from1:to1, from2:to2,..."}.
+This option is very useful in displaying labeled pixels (not actual data
images which have noise) like segmentation maps.
+In labeled images, usually a group of pixels have a fixed integer value.
+With this option, you can manipulate the labels before the image is displayed
to get a better output for print or to emphasize a particular set of labels
and ignore the rest.
+The labels in the images will be changed in the same order given.
+By default, the pixel values are first converted (with this option), then
truncated (see @option{--fluxlow} and @option{--fluxhigh}).
-For example, with the command below, the first two columns of
@file{table.fits} will be printed along with a third column that is the result
of multiplying the first column with @mymath{10^{10}} (for example, to convert
wavelength from Meters to Angstroms).
-Note that without the `@key{$}', it is not possible to distinguish between
``1'' as a column-counter, or ``1'' as a constant number to use in the
arithmetic operation.
-Also note that because of the significance of @key{$} for the command-line
environment, the single-quotes are the recommended quoting method (as in an AWK
expression), not double-quotes (for the significance of using single quotes see
the box below).
+You can use any number for the values irrespective of your final output; the
given values are stored and used in double precision floating point format.
+So for example, if your input image has labels from 1 to 20000 and you only
want to display those with labels 957 and 11342, you can run ConvertType with
these options:
@example
-$ asttable table.fits -c1,2 -c'arith $1 1e10 x'
+$ astconvertt --change=957:50000,11342:50001 --fluxlow=5e4 \
+ --fluxhigh=1e5 segmentationmap.fits --output=jpg
@end example
-@cartouche
@noindent
-@strong{Single quotes when string contains @key{$}}: On the command-line, or
in shell-scripts, @key{$} is used to expand variables, for example, @code{echo
$PATH} prints the value (a string of characters) in the variable @code{PATH},
it will not simply print @code{$PATH}.
-This operation is also permitted within double quotes, so @code{echo "$PATH"}
will produce the same output.
-This is good when printing values, for example, in the command below,
@code{$PATH} will expand to the value within it.
+While the output JPEG format is only 8-bit, this operation is done in an
intermediate step which is stored in double precision floating point.
+The pixel values are converted to 8-bit after all operations on the input
fluxes have been completed.
+By placing the value in double quotes you can use as many spaces as you like
for better readability.
-@example
-$ echo "My path is: $PATH"
-@end example
+@item -C
+@itemx --changeaftertrunc
+Change pixel values (with @option{--change}) after truncation of the flux
values; by default it is the opposite.
-If you actually want to return the literal string @code{$PATH}, not the value
in the @code{PATH} variable (like the scenario here in column arithmetic), you
should put it in single quotes like below.
-The printed value here will include the @code{$}, please try it to see for
yourself and compare to above.
+@item -L FLT
+@itemx --fluxlow=FLT
+The minimum flux (pixel value) to display in the output image, any pixel value
below this value will be set to this value in the output.
+If the value to this option is the same as @option{--fluxhigh}, then no flux
truncation will be applied.
+Note that when multiple channels are given, this value is used for all the
color channels.
-@example
-$ echo 'My path is: $PATH'
-@end example
+@item -H FLT
+@itemx --fluxhigh=FLT
+The maximum flux (pixel value) to display in the output image, see
+@option{--fluxlow}.
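+For example, the sketch below (with a hypothetical @file{image.fits})
truncates the displayed pixel values to the range 0 to 5:
+@example
+$ astconvertt image.fits --fluxlow=0 --fluxhigh=5 --output=jpg
+@end example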
-Therefore, when your column arithmetic involves the @key{$} sign (to specify
columns by number), quote your @code{arith } string with a single quotation
mark.
-Otherwise you can use both single or double quotes.
-@end cartouche
+@item -m INT
+@itemx --maxbyte=INT
+This is only used for the JPEG and EPS output formats which have an 8-bit
space for each channel of each pixel.
+The maximum value in each pixel can therefore be @mymath{2^8-1=255}.
+With this option you can change (decrease) the maximum value.
+By doing so you will decrease the dynamic range.
+It can be useful if you plan to use those values for other purposes.
+@item -A
+@itemx --forcemin
+Enforce the value of @option{--fluxlow} (when it is given), even if it is
smaller than the minimum of the dataset and the output is a format that
supports color.
+This is particularly useful when you are converting a number of images to a
common image format like JPEG or PDF with a single command and want them all to
have the same range of colors, independent of the contents of the dataset.
+Note that if the minimum value is smaller than @option{--fluxlow}, then this
option is redundant.
+@cindex PDF
+@cindex EPS
+@cindex PostScript
+By default, when the dataset only has two values, @emph{and} the output format
is PDF or EPS, ConvertType will use the PostScript optimization that allows
setting the pixel values per bit, not byte (@ref{Recognized file formats}).
+This can greatly help reduce the file size.
+However, when @option{--fluxlow} or @option{--fluxhigh} are called, this
optimization is disabled: even though there are only two values (the image is
binary), the difference between them does not correspond to the full contrast
of black and white.
+@item -B
+@itemx --forcemax
+Similar to @option{--forcemin}, but for the maximum.
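+For example, the sketch below (with hypothetical inputs) gives two images the
same color range, independent of their actual minimum and maximum values:
+@example
+$ astconvertt img-1.fits --fluxlow=-0.1 --fluxhigh=10 \
+              --forcemin --forcemax --output=jpg
+$ astconvertt img-2.fits --fluxlow=-0.1 --fluxhigh=10 \
+              --forcemin --forcemax --output=jpg
+@end example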
-@cartouche
-@noindent
-@strong{Manipulate all columns in one call using @key{$_all}}: Usually we
manipulate one column in one call of column arithmetic.
-For instance, with the command below the elements of the @code{AWAV} column
will be summed.
+@item -i
+@itemx --invert
+For 8-bit output types (JPEG, EPS, and PDF for example) the final value that
is stored is inverted so white becomes black and vice versa.
+The reason for this is that astronomical images usually contain a very large
area of blank sky; without inversion, a large area of the output would be
black.
+Note that this behavior is ideal for gray-scale images; if you want a color
image, the colors are going to be mixed up.
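+For example, the sketch below (with a hypothetical @file{image.fits}) inverts
the stored values so that the blank sky is printed as white:
+@example
+$ astconvertt image.fits --invert --output=jpg
+@end example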
+@end table
-@example
-$ asttable table.fits -c'arith AWAV sumvalue'
-@end example
+@node Drawing with vector graphics, , Pixel visualization, Invoking
astconvertt
+@subsubsection Drawing with vector graphics
-But sometimes, we want to manipulate more than one column with the same
expression.
-For example we want to sum all the elements of all the columns.
-In this case we could use the following command (assuming that the table has
four different @code{AWAV-*} columns):
+With the options described in this section, you can draw marks over your
to-be-published images (for example, in PDF).
+Each mark can be highly customized, so different marks can have different
shapes, colors, line widths, text, text sizes, etc.
+The properties of the marks should be stored in a table that is given to the
@option{--marks} option described below.
+A fully working demo on adding marks is provided in @ref{Marking objects for
publication}.
-@example
-$ asttable table.fits -c'arith AWAV-1 sumvalue' \
- -c'arith AWAV-2 sumvalue' \
- -c'arith AWAV-3 sumvalue' \
- -c'arith AWAV-4 sumvalue'
-@end example
+@cindex PostScript point
+@cindex Vector graphics point
+@cindex Point (Vector graphics; PostScript)
+An important factor to consider when drawing vector graphics is that vector
graphics standards (the PostScript standard in this case) use a ``point'' as
the primary unit of line thickness or font size, such that 72 points correspond
to 1 inch (or 2.54 centimeters).
+In other words, there are roughly 3 PostScript points in every millimeter.
+On the other hand, the pixels of the images you plan to show as the background
do not have any real size!
+Pixels are abstract and can be associated with any print-size.
-To avoid repetition and mistakes, instead of using column arithmetic many
times, we can use the @code{$_all} identifier.
-When column arithmetic encounters this special string, it will repeat the
expression for all the columns in the input table.
-Therefore the command above can be written as:
+In ConvertType, the print-size of your final image is set with the
@option{--widthincm} option (see @ref{ConvertType input and output}).
+The value to @option{--widthincm} is the to-be width of the image in
centimeters.
+It therefore defines the thickness of lines or font sizes for your vector
graphics features (like the image border or marks).
+Just recall that we are not talking about resolution!
+Vector graphics have infinite resolution!
+We are talking about the relative thickness of the lines (or font sizes) in
relation to the pixels in your background image.
+
+@table @option
+@item -b INT
+@itemx --borderwidth=INT
+@cindex Border on an image
+The width of the border to be put around the EPS and PDF outputs in units of
PostScript points.
+If you are planning on adding a border, its thickness in relation to your
image pixel sizes is highly correlated with the value you give to the
@option{--widthincm} parameter.
+See the description at the start of this section for more.
+
+Unfortunately in the document structuring convention of the PostScript
language, the ``bounding box'' has to be in units of PostScript points with no
fractions allowed.
+So the border values can only be specified as integers.
+To have a final border that is thinner than one PostScript point in your
document, you can ask for a larger width in ConvertType and then scale down the
output EPS or PDF file in your document preparation program.
+For example, by setting @command{width} in your @command{includegraphics}
command in @TeX{} or @LaTeX{} to be larger than the value to
@option{--widthincm}.
+Since it is vector graphics, the changes of size have no effect on the quality
of your output (pixels do not get different values).
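+For example, the sketch below (with a hypothetical @file{image.fits}) adds a
border that is two PostScript points thick:
+@example
+$ astconvertt image.fits --borderwidth=2 --output=image.pdf
+@end example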
+
+@item --bordercolor=STR
+The name of the color to use for the border that will be put around the EPS
and PDF outputs.
+The list of available colors, along with their name and an example can be seen
with the following command (also see @ref{Vector graphics colors}):
@example
-$ asttable table.fits -c'arith $_all sumvalue'
+$ astconvertt --listcolors
@end example
-@end cartouche
+This option only accepts the name of the color, not the numeric identifier.
-Alternatively, if the columns have meta-data and the first two are
respectively called @code{AWAV} and @code{SPECTRUM}, the command above is
equivalent to the command below.
-Note that the character `@key{$}' is no longer necessary in this scenario
(because names will not be confused with numbers):
+@item --marks=STR
+Draw vector graphics (infinite resolution) marks over the image.
+The value to this option should be the file name of a table containing the
mark information.
+The table given to this option can have various properties for each mark in
each column.
+You can specify which column contains which property of the marks using the
options below that start with @option{--mark}.
+Only two property columns are mandatory (@option{--markcoords}), the rest are
optional.
+
+The table can be in any of Gnuastro's @ref{Recognized table formats}.
+For more on the difference between vector and raster graphics, see @ref{Raster
and Vector graphics}.
+For example, if your table with mark information is called
@file{marks.fits}, you can use the command below to draw marks (with the
default shape, size and color) over the given coordinates.
@example
-$ asttable table.fits -cAWAV,SPECTRUM -c'arith AWAV 1e10 x'
+$ astconvertt image.fits --output=image.pdf \
+ --marks=marks.fits --mode=wcs \
+ --markcoords=RA,DEC
@end example
-Comparison of the two commands above clearly shows why it is recommended to
use column names instead of numbers.
-When the columns have descriptive names, the command/script actually becomes
much more readable, describing the intent of the operation.
-It is also independent of the low-level table structure: for the second
command, the column numbers of the @code{AWAV} and @code{SPECTRUM} columns in
@file{table.fits} is irrelevant.
+You can highly customize each mark with different columns in @file{marks.fits}
using the @option{--mark*} options below (for example, giving each mark a
different color, shape, size or text).
-Column arithmetic changes the values of the data within the column.
-So the old column metadata cannot be used any more.
-By default the output column of the arithmetic operation will be given a
generic metadata (for example, its name will be @code{ARITH_1}, which is hardly
useful!).
-But metadata are critically important and it is good practice to always give
each column a short but descriptive name, units and also a comment for more
explanation.
-To add metadata to a column, you can use the @option{--colmetadata} option
that is described in @ref{Invoking asttable} and @ref{Operation precedence in
Table}.
-
-Since an arithmetic expression is a value to @option{--column}, it does not
necessarily have to be in a separate option, so the commands above are also
identical to the command below (note that it only has one @option{-c} option).
-Just be very careful with the quoting!
-With the @option{--colmetadata} option, we are also giving a name, units and a
comment to the third column.
+@item --markshdu=STR/INT
+The HDU (or extension) name or number of the table containing mark properties
(file given to @option{--marks}).
+This is only relevant if the table is in the FITS format and there is more
than one HDU in the FITS file.
-@example
-$ asttable table.fits -cAWAV,SPECTRUM,'arith AWAV 1e10 x' \
- --colmetadata=3,AWAV_A,angstrom,"Wavelength (in Angstroms)"
-@end example
+@item -r STR,STR
+@itemx --markcoords=STR,STR
+The column names (or numbers) containing the coordinates of each mark (in
table given to @option{--marks}).
+Only two values should be given to this option (one for each coordinate).
+They can either be given to one call (@option{--markcoords=RA,DEC}) or in
separate calls (@option{--markcoords=RA --markcoords=DEC}).
-In case you need to append columns from other tables (with
@option{--catcolumnfile}), you can use those extra columns in column arithmetic
also.
-The easiest, and most robust, way is that your columns of interest (in all
files whose columns are to be merged) have different names.
-In this scenario, you can simply use the names of the columns you plan to
append.
-If there are similar names, note that by default Table appends a @code{-N} to
similar names (where @code{N} is the file counter given to
@option{--catcolumnfile}, see the description of @option{--catcolumnfile} for
more).
-Using column numbers can get complicated: if the number is smaller than the
main input's number of columns, the main input's column will be used.
-Otherwise (when the requested column number is larger than the main input's
number of columns), the final output (after appending all the columns from all
the possible files) column number will be used.
+When @option{--mode=img}, the columns will be associated to the
horizontal/vertical coordinates of the image, and interpreted in units of
pixels.
+In @option{--mode=wcs}, the columns will be associated to the WCS coordinates
(typically Right Ascension and Declination, in units of degrees).
-Almost all the arithmetic operators of @ref{Arithmetic operators} are also
supported for column arithmetic in Table.
-In particular, the few that are not present in the Gnuastro
library@footnote{For a list of the Gnuastro library arithmetic operators,
please see the macros starting with @code{GAL_ARITHMETIC_OP} and ending with
the operator name in @ref{Arithmetic on datasets}.} are not yet supported for
column arithmetic.
-Besides the operators in @ref{Arithmetic operators}, several operators are
only available in Table to use on table columns.
+@item -O STR
+@itemx --mode=STR
+The coordinate mode for interpreting the values in the columns given to the
@option{--markcoords} option.
+The acceptable values are either @code{img} (for image or pixel coordinates)
or @code{wcs} (for the World Coordinate System, typically RA and Dec).
+For the WCS-mode, the input image should have the necessary WCS keywords,
otherwise ConvertType will crash.
+@item --markshape=STR/INT
+@cindex Shapes for marks (vector graphics)
+The column name(s), or number(s), containing the shapes of each mark (in table
given to @option{--marks}).
+The shapes can either be identified by their name, or their numerical
identifier.
+If identifying them by name in a plain-text table, you need to define a string
column (see @ref{Gnuastro text table format}).
+The full list of names is shown below, with their numerical identifiers in
parentheses afterwards.
+For each shape, you can also specify properties such as the size, line width,
rotation, and color.
+See the description of the relevant @option{--mark*} option below.
-@cindex WCS: World Coordinate System
-@cindex World Coordinate System (WCS)
@table @code
-@item wcs-to-img
-Convert the given WCS positions to image/dataset coordinates based on the
number of dimensions in the WCS structure of @option{--wcshdu} extension/HDU in
@option{--wcsfile}.
-It will output the same number of columns.
-The first popped operand is the last FITS dimension.
-For example, the two commands below (which have the same output) will produce
5 columns.
-The first three columns are the input table's ID, RA and Dec columns.
-The fourth and fifth columns will be the pixel positions in @file{image.fits}
that correspond to each RA and Dec.
+@item circle (1)
+A circular circumference.
+Its @emph{radius} is defined by a single size element (the first column given
to @option{--marksize}).
+Any value in the second size column (if given for other shapes in the same
call) is ignored by this shape.
-@example
-$ asttable table.fits -cID,RA,DEC,'arith RA DEC wcs-to-img' \
- --wcsfile=image.fits
-$ asttable table.fits -cID,RA -cDEC \
- -c'arith RA DEC wcs-to-img' --wcsfile=image.fits
-@end example
+@item plus (2)
+The plus sign (@mymath{+}).
+The @emph{length of its lines} is defined by a single size element (the first
column given to @option{--marksize}), such that the intersection of its lines
is on the central coordinate of the mark.
+Any value in the second size column (if given for other shapes in the same
call) is ignored by this shape.
-@item img-to-wcs
-Similar to @code{wcs-to-img}, except that image/dataset coordinates are
converted to WCS coordinates.
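-For example, the sketch below (assuming @file{table.fits} has hypothetical
@code{ID}, @code{X} and @code{Y} columns) appends the WCS counterparts of the
pixel positions:
-@example
-$ asttable table.fits -cID,X,Y,'arith X Y img-to-wcs' \
-           --wcsfile=image.fits
-@end example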
+@item cross (3)
+A multiplication sign (@mymath{\times}).
+The @emph{length of its lines} is defined by a single size element (the first
column given to @option{--marksize}), such that the intersection of its lines
is on the central coordinate of the mark.
+Any value in the second size column (if given for other shapes in the same
call) is ignored by this shape.
-@item distance-flat
-Return the distance between two points assuming they are on a flat surface.
-Note that each point needs two coordinates, so this operator needs four
operands (currently it only works for 2D spaces).
-The first and second popped operands are considered to belong to one point and
the third and fourth popped operands to the second point.
+@item ellipse (4)
+An elliptical circumference.
+Its major axis radius is defined by the first size element (first column given
to @option{--marksize}), and its axis ratio is defined through the second size
element (second column given to @option{--marksize}).
-Each of the input points can be a single coordinate or a full table column
(containing many points).
-In other words, the following commands are all valid:
+@item point (5)
+A point (or a filled circle).
+Its @emph{radius} is defined by a single size element (the first column given
to @option{--marksize}).
+Any value in the second size column (if given for other shapes in the same
call) is ignored by this shape.
-@example
-$ asttable table.fits \
- -c'arith X1 Y1 X2 Y2 distance-flat'
-$ asttable table.fits \
- -c'arith X Y 12.345 6.789 distance-flat'
-$ asttable table.fits \
- -c'arith 12.345 6.789 X Y distance-flat'
-@end example
+This filled circle mark is called a ``point'' because it is usually used with
a small size (appearing as a point on the whole image).
+But there is no limit on its size, so it can be arbitrarily large.
-In the first case we are assuming that @file{table.fits} has the following
four columns: @code{X1}, @code{Y1}, @code{X2}, @code{Y2}.
-The column returned by this operator will contain the distance between the two
points in each row, with coordinates (@code{X1}, @code{Y1}) and (@code{X2},
@code{Y2}).
-In other words, for each row, the distance between the two points is
calculated.
-In the second and third cases (which are identical), it is assumed that
@file{table.fits} has the two columns @code{X} and @code{Y}.
-The column returned by this operator will contain the distance of each row's
point from the fixed point at (12.345, 6.789).
+@item square (6)
+A square circumference.
+Its @emph{edge length} is defined by a single size element (the first column
given to @option{--marksize}).
+Any value in the second size column (if given for other shapes in the same
call) is ignored by this shape.
-@item distance-on-sphere
-Return the spherical angular distance (along a great circle, in degrees)
between the given two points.
-Note that each point needs two coordinates (in degrees), so this operator
needs four operands.
-The first and second popped operands are considered to belong to one point and
the third and fourth popped operands to the second point.
+@item rectangle (7)
+A rectangular circumference.
+Its length along the horizontal image axis is defined by the first size
element (first column given to @option{--marksize}), and its length along the
vertical image axis is defined through the second size element (second column
given to @option{--marksize}).
-Each of the input points can be a single coordinate or a full table column
(containing many points).
-In other words, the following commands are all valid:
+@item line (8)
+A line.
+The line's @emph{length} is defined by a single size element (the first column
given to @option{--marksize}).
+The line will be centered on the given coordinate.
+Like all shapes, you can rotate the line about its center using the
@option{--markrotate} column.
+Any value in the second size column (if given for other shapes in the same
call) is ignored by this shape.
-@example
-$ asttable table.fits \
- -c'arith RA1 DEC1 RA2 DEC2 distance-on-sphere'
-$ asttable table.fits \
- -c'arith RA DEC 9.876 5.432 distance-on-sphere'
-$ asttable table.fits \
- -c'arith 9.876 5.432 RA DEC distance-on-sphere'
-@end example
+@end table
-In the first case we are assuming that @file{table.fits} has the following
four columns: @code{RA1}, @code{DEC1}, @code{RA2}, @code{DEC2}.
-The column returned by this operator will contain the angular distance between
the two points in each row, with coordinates (@code{RA1}, @code{DEC1}) and
(@code{RA2}, @code{DEC2}).
-In other words, for each row, the angular distance between the two points is
calculated.
-In the second and third cases (which are identical), it is assumed that
@file{table.fits} has the two columns @code{RA} and @code{DEC}.
-The column returned by this operator will contain the angular distance of each
row's point from the fixed point at (9.876, 5.432).
+@item --markrotate=STR/INT
+Column name or number that contains the mark's rotation angle.
+The rotation angle should be in degrees and be relative to the horizontal axis
of the image.
-The distance (along a great circle) on a sphere between two points is
calculated with the equation below, where @mymath{r_1}, @mymath{r_2},
@mymath{d_1} and @mymath{d_2} are the right ascensions and declinations of
points 1 and 2.
+@item --marksize=STR[,STR]
+The column name(s), or number(s), containing the size(s) of each mark (in
table given to @option{--marks}).
+All shapes need at least one ``size'' parameter and some need two.
+For the interpretation of the size column(s) for each shape, see the
@option{--markshape} option's description.
+Since the size column(s) are optional, default values will be used when they
are not specified (the defaults may be too small in larger images, so you may
need to change them).
-@dispmath {\cos(d)=\sin(d_1)\sin(d_2)+\cos(d_1)\cos(d_2)\cos(r_1-r_2)}
+By default, the values in the size column are assumed to be in the same units
as the coordinates (defined by the @option{--mode} option, described above).
+However, when the coordinates are in WCS-mode, some special cases may occur
for the size (one of them is shown in the sketch after this list).
+@itemize
+@item
+The native WCS units (usually degrees) can be too large, and it may be more
convenient for the values in the size column(s) to be in arc-seconds.
+In this case, you can use the @option{--sizeinarcsec} option.
+@item
+Similar to above, but in units of arc-minutes.
+In this case, you can use the @option{--sizeinarcmin} option.
+@item
+Your sizes may be in units of pixels, not the WCS units.
+In this case, you can use the @option{--sizeinpix} option.
+@end itemize
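+For example, the sketch below (with a hypothetical @file{marks.fits}
containing @code{RA}, @code{DEC} and @code{RADIUS} columns) interprets the
sizes in arc-seconds while the coordinates are in WCS-mode:
+@example
+$ astconvertt image.fits --output=image.pdf --marks=marks.fits \
+              --mode=wcs --markcoords=RA,DEC \
+              --marksize=RADIUS --sizeinarcsec
+@end example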
-@item ra-to-degree
-Convert the hour-wise Right Ascension (RA) string, in the sexagesimal format
of @code{_h_m_s} or @code{_:_:_}, to degrees.
-Note that the input column has to have a string format.
-In FITS tables, string columns are well-defined.
-For plain-text tables, please follow the standards defined in @ref{Gnuastro
text table format}, otherwise the string column will not be read.
-@example
-$ asttable catalog.fits -c'arith RA ra-to-degree'
-$ asttable catalog.fits -c'arith $5 ra-to-degree'
-@end example
+@item --sizeinpix
+In WCS-mode, assume that the sizes are in units of pixels.
+By default, when in WCS-mode, the sizes are assumed to be in the units of the
WCS coordinates (usually degrees).
-@item dec-to-degree
-Convert the sexagesimal Declination (Dec) string, in the format of
@code{_d_m_s} or @code{_:_:_}, to degrees (a single floating point number).
-For more details please see the @option{ra-to-degree} operator.
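-For example, the sketch below (with a hypothetical @file{catalog.fits}
containing a sexagesimal @code{DEC} string column) returns the declinations in
degrees:
-@example
-$ asttable catalog.fits -c'arith DEC dec-to-degree'
-@end example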
+@item --sizeinarcsec
+In WCS-mode, assume that the sizes are in units of arc-seconds.
+By default, when in WCS-mode, the sizes are assumed to be in the units of the
WCS coordinates (usually degrees).
-@item degree-to-ra
-@cindex Sexagesimal
-@cindex Right Ascension
-Convert degrees (a column with a single floating point number) to the Right
Ascension, RA, string (in the sexagesimal format hours, minutes and seconds,
written as @code{_h_m_s}).
-The output will be a string column so no further mathematical operations can
be done on it.
-The output file can be in any format (for example, FITS or plain-text).
-If it is plain-text, the string column will be written following the standards
described in @ref{Gnuastro text table format}.
+@item --sizeinarcmin
+In WCS-mode, assume that the sizes are in units of arc-minutes.
+By default, when in WCS-mode, the sizes are assumed to be in the units of the
WCS coordinates (usually degrees).
-@item degree-to-dec
-@cindex Declination
-Convert degrees (a column with a single floating point number) to the
Declination, Dec, string (in the format of @code{_d_m_s}).
-See the @option{degree-to-ra} for more on the format of the output.
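-For example, the sketch below (with a hypothetical @file{catalog.fits}
containing @code{RA} and @code{DEC} columns in degrees) writes both as
sexagesimal strings:
-@example
-$ asttable catalog.fits -c'arith RA degree-to-ra' \
-           -c'arith DEC degree-to-dec'
-@end example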
+@item --marklinewidth=STR/INT
+Column containing the width (thickness) of the line to draw each mark.
+The line width is measured in units of ``points'' (where 72 points is one
inch), and it can be any positive floating point number.
+Therefore, the thickness (in relation to the pixels of your image) depends on
the @option{--widthincm} option.
+For more, see the description at the start of this section.
-@item date-to-sec
-@cindex Unix epoch time
-@cindex Time, Unix epoch
-@cindex Epoch, Unix time
-Return the number of seconds from the Unix epoch time (00:00:00 Thursday,
January 1st, 1970).
-The input (popped) operand should be a string column in the FITS date format
(most generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}).
+@item --markcolor=STR/INT
+Column containing the color of the mark.
+This column can be either a string or an integer.
+As a string, the color name can be written directly in your table (this
greatly helps in human readability).
+For more on string columns see @ref{Gnuastro text table format}.
+As an integer, you can simply use the numerical identifier of the color.
+You can see the list of colors with their names and numerical identifiers in
Gnuastro by running ConvertType with @option{--listcolors}, or see @ref{Vector
graphics colors}.
-The returned operand will be named @code{UNIXSEC} (short for Unix-seconds) and
will be a 64-bit, signed integer, see @ref{Numeric data types}.
-If the input string has sub-second precision, it will be ignored because
floating point numbers cannot accurately store numbers with many significant
digits.
-To preserve sub-second precision, please use @code{date-to-millisec}.
+@item --listcolors
+The list of acceptable color names, their codes and their representation can
be seen with the @option{--listcolors} option.
+By ``representation'' we mean that the color will be shown on the terminal as
the background in that column.
+But this will only be properly visible with ``true color'' or 24-bit
terminals, see @url{https://en.wikipedia.org/wiki/ANSI_escape_code,ANSI escape
sequence standard}.
+Most modern GNU/Linux terminals support 24-bit colors natively, and no
modification is necessary.
+For macOS, see the box below.
-For example, in the example below we are using this operator, in combination
with the @option{--keyvalue} option of the Fits program, to sort your desired
FITS files by observation date (value in the @code{DATE-OBS} keyword in example
below):
+The printed text in standard output is in the @ref{Gnuastro text table
format}, so if you want to store this table, you can simply pipe the output to
Gnuastro's Table program and store it as a FITS table:
@example
-$ astfits *.fits --keyvalue=DATE-OBS --colinfoinstdout \
- | asttable -cFILENAME,'arith DATE-OBS date-to-sec' \
- --colinfoinstdout \
- | asttable --sort=UNIXSEC
+$ astconvertt --listcolors | asttable -ocolors.fits
@end example
-If you do not need to see the Unix-seconds any more, you can add a
@option{-cFILENAME} (short for @option{--column=FILENAME}) at the end.
-For more on @option{--keyvalue}, see @ref{Keyword inspection and manipulation}.
+@cindex iTerm
+@cindex macOS terminal 24-bit color
+@cindex Color in macOS terminals
+@cartouche
+@noindent
+@strong{macOS terminal colors}: as of August 2022, the default macOS terminal
(iTerm) does not support 24-bit colors!
+The output of @option{--listcolors} therefore does not display the actual
colors (you can only use the color names).
+One tested solution is to install and use @url{https://iterm2.com, iTerm2},
which is free software and available in
@url{https://formulae.brew.sh/cask/iterm2, Homebrew}.
+iTerm2 is described as a successor for iTerm and works on macOS 10.14
(released in September 2018) or newer.
+@end cartouche
-@item date-to-millisec
-Return the number of milli-seconds from the Unix epoch time (00:00:00
Thursday, January 1st, 1970).
-The input (popped) operand should be a string column in the FITS date format
(most generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}, where @code{.ddd} is the
optional sub-second component).
+@item --marktext=STR/INT
+Column name or number that contains the text that should be printed under the
mark.
+If the column is numeric, the number will be printed under the mark (for
example, to write the magnitude or redshift of each object under the mark
showing it).
+For the precision of writing floating point columns, see
@option{--marktextprecision}.
+But if the column has a string format (for example, the name of the object
like NGC1234), you need to define the column as a string column (see
@ref{Gnuastro text table format}).
-The returned operand will be named @code{UNIXMILLISEC} (short for Unix
milli-seconds) and will be a 64-bit, signed integer, see @ref{Numeric data
types}.
-The returned value is not a floating point type because for large numbers,
floating point data types loose single-digit precision (which is important
here).
+For text with different lengths, set the length in the definition of the
column to the maximum length of the strings to be printed.
+If there are some rows or marks that don't require text, set the string in
this column to @option{n/a} (not applicable; the blank value for strings in
Gnuastro).
+When the strings have different lengths, make sure to have enough white
spaces (for the shorter strings) so the adjacent columns are not taken as part
of the string (see @ref{Gnuastro text table format}).
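+For example, a plain-text marks table with a string column could look
+like the minimal sketch below (all names and values are hypothetical),
+following @ref{Gnuastro text table format}:
+
+@example
+# Column 1: RA    [deg ,f64,] Right ascension of mark.
+# Column 2: DEC   [deg ,f64,] Declination of mark.
+# Column 3: LABEL [none,str7,] Text under mark.
+53.1616  -27.7802  NGC1234
+53.1755  -27.7888  n/a
+@end example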
-Other than the units of the output, this operator behaves similarly to
@code{date-to-sec}.
-See the description of that operator for an example.
+@item --marktextprecision=INT
+The number of decimal digits to print after the floating point.
+This is only relevant when @option{--marktext} is given, and the selected
column has a floating point format.
-@item sorted-to-interval
-Given a single column (which must be already sorted and have a numeric data
type), return two columns: the first returned column is the minimum and the
second returned column is the maximum value of the interval of each row row.
-The maximum of each row is equal to the minimum of the previous row; thus
creating a contiguous interval coverage of the input column's range in all rows.
+@item --markfont=STR/INT
+@cindex Fonts
+@cindex Ghostscript fonts
+Column name or number that contains the font for the displayed text under the
mark.
+This is only relevant if @option{--marktext} is called.
+The font should be accessible by Ghostscript.
-The minimum value of the first row and maximum of the last row will be
smaller/larger than the respective row of the input (based on the distance to
the next/previous element).
-This is done to ensure that if your input has a fixed interval length between
all elements, the first and last intervals also have that fixed length.
+If you are not familiar with the available fonts on your system's Ghostscript,
you can use the @option{--showfonts} option to see all the fonts in a custom
PDF file (one page per font).
+If you are already familiar with the font you want, but just want to make sure
about its presence (or spelling!), you can get a list (on standard output) of
all the available fonts with the @option{--listfonts} option.
+Both are described below.
-For example, with the command below, we'll use this operator on a hypothetical
radial profile.
-Note how the intervals are contiguous even though the radius values are not
equally distant (if the row with a radius of 2.5 didn't exist, the intervals
would all be the same length).
-For another example of the usage of this operator, see the example in the
description of @option{--customtable} in @ref{MakeProfiles profile settings}.
+@cindex Adding Ghostscript fonts
+It is possible to add custom fonts to Ghostscript as described in the
@url{https://ghostscript.com/doc/current/Fonts.htm, Fonts section} of the
Ghostscript manual.
-@example
-$ cat radial-profile.txt
-# Column 1: RADIUS [pix,f32,] Distance to center in pixels.
-# Column 2: MEAN [ADU,f32,] Mean value at that radius.
-0 100
-1 40
-2 30
-2.5 25
-3 20
+@item --markfontsize=STR/INT
+Column name or number that contains the font size to use.
+This is only relevant if a text column has been defined (with
@option{--marktext}, described above).
+The font size is in units of ``point''s, see description at the start of this
section for more.
-$ asttable radial-profile.txt --txtf32f=fixed --txtf32p=3 \
- -c'arith RADIUS sorted-to-interval',MEAN
--0.500 0.500 100.000
-0.500 1.500 40.000
-1.500 2.250 30.000
-2.250 2.750 25.000
-2.750 3.250 20.000
-@end example
+@item --showfonts
+Create a special PDF file that shows the name and shape of all available fonts
in your system's Ghostscript.
+You can use this for selecting the best font to put in the
@option{--markfont} column.
+The available fonts can differ from one system to another (depending on how
Ghostscript was configured in that system).
+The PDF file's name is constructed by appending @file{-fonts.pdf} to the
file name given to the @option{--output} option.
-Such intervals can be useful in scenarios like generating the input to
@option{--customtable} in MakeProfiles (see @ref{MakeProfiles profile
settings}) from a radial profile (see @ref{Generate radial profile}).
+The PDF file will have one page for each font, and the sizes of the pages are
customized for showing the fonts (each page is horizontally elongated).
+This helps in checking the fonts: disable ``continuous'' mode in your
PDF viewer, and set the zoom such that the width of the page corresponds to
the width of your PDF viewer.
+Simply pressing the left/right keys will then nicely show each font
separately.
+
+@item --listfonts
+Print (to standard output) the names of all available fonts in Ghostscript
that you can use for the @option{--markfont} column.
+The available fonts can differ from one system to another (depending on how
Ghostscript was configured in that system).
+If you are not already familiar with the shape of each font, please use
@option{--showfonts} (described above).
@end table
-@node Operation precedence in Table, Invoking asttable, Column arithmetic,
Table
-@subsection Operation precedence in Table
-The Table program can do many operations on the rows and columns of the input
tables and they are not always applied in the order you call the operation on
the command-line.
-In this section we will describe which operation is done before/after which
operation.
-Knowing this precedence table is important to avoid confusion when you ask for
more than one operation.
-For a description of each option, please see @ref{Invoking asttable}.
-By default, column-based operations will be done first.
-You can ask for switching to row-based operations to be done first, using the
@option{--rowfirst} option.
-@cartouche
-@noindent
-@strong{Pipes for different precedence:} It may happen that your desired
series of operations cannot be done with the precedence mentioned below (in one
command).
-In this case, you can pipe the output of one call to @command{asttable} to
another @command{asttable}.
-Just don't forget to give @option{-O} (or @option{--colinfoinstdout}) to the
first instance (so the column metadata are also passed to the next instance).
-Without metadata, all numbers will be read as double-precision (see
@ref{Gnuastro text table format}; recall that piping is done in plain text
format), vector columns will be broken into single-valued columns, and column
names, units and comments will be lost.
-At the end of this section, there is an example of doing this.
-@end cartouche
-@table @asis
-@item Input table information
-The first set of operations that will be preformed (if requested) are the
printing of the input table information.
-Therefore, when the following options are called, the column data are not read
at all.
-Table simply reads the main input's column metadata (name, units, numeric data
type and comments), and the number of rows and prints them.
-Table then terminates and no other operation is done.
-These can therefore be called at the end of an arbitrarily long Table command.
-When you have forgot some information about the input table.
-You can then delete these options and continue writing the command (using the
shell's history to retrieve the previous command with an up-arrow key).
-At any time only a single one of the options in this category may be called.
-The order of checking for these options is therefore important: in the same
order that they are described below:
-@table @asis
-@item Column and row information (@option{--information} or @option{-i})
-Print the list of input columns and the metadata of each column in a single
row.
-This includes the column name, numeric data type, units and comments of each
column within a separate row of the output.
-Finally, print the number of rows.
-@item Number of columns (@option{--info-num-cols})
-Print the number of columns in the input table.
-Only a single integer (number of columns) is printed before Table terminates.
-@item Number of rows (@option{--info-num-rows})
-Print the number of rows in the input table.
-Only a single integer (number of rows) is printed before Table terminates.
-@end table
-@item Column selection (@option{--column})
-When this option is given, only the columns given to this option (from the
main input) will be used for all future steps.
-When @option{--column} (or @option{-c}) is not given, then all the main
input's columns will be used in the next steps.
-@item Column-based operations
-By default the following column-based operations will be done before the
row-based operations in the next item.
-If you need to give precedence to row-based operations, use
@option{--rowfirst}.
+@node Table, Query, ConvertType, Data containers
+@section Table
-@table @asis
+Tables are the high-level products of processing lower-level data like
images or spectra.
+For example, in Gnuastro, MakeCatalog will process the pixels over an object
and produce a catalog (or table) with the properties of each object such as
magnitudes and positions (see @ref{MakeCatalog}).
+Each one of these properties is a column in its output catalog (or table) and
for each input object, we have a row.
-@item Column(s) from other file(s): @option{--catcolumnfile}
-When column concatenation (addition) is requested, columns from other tables
(in other files, or other HDUs of the same FITS file) will be added after the
existing columns are read from the main input.
-In one command, you can call @option{--catcolumnfile} multiple times to allow
addition of columns from many files.
+When there are only a small number of objects (rows) and not too many
properties (columns), a simple plain-text file is usually sufficient to store,
transfer, or even use the produced data.
+However, to be more efficient, astronomers have defined the FITS binary table
standard to store data in a binary format (which cannot be read in a
plain-text editor).
+This can offer major advantages: the file size will be greatly reduced and the
reading and writing will also be faster (because the RAM and CPU also work in
binary).
+The acceptable table formats are fully described in @ref{Tables}.
-Therefore you can merge the columns of various tables into one table in this
step (at the start), then start adding/limiting the rows, or building vector
columns, .
-If any of the row-based operations below are requested in the same
@code{asttable} command, they will also be applied to the rows of the added
columns.
-However, the conditions to keep/reject rows can only be applied to the rows of
the columns in main input table (not the columns that are added with these
options).
+@cindex AWK
+@cindex GNU AWK
+Binary tables are not easily readable with basic plain-text editors.
+There is no fixed/unified standard on how the zeros and ones should be
interpreted.
+Unix-like operating systems have flourished because of a simple fact:
communication between the various tools is based on human readable
characters@footnote{In ``The art of Unix programming'', Eric Raymond makes this
suggestion to programmers: ``When you feel the urge to design a complex binary
file format, or a complex binary application protocol, it is generally wise to
lie down until the feeling passes.''.
+This is a great book and strongly recommended, give it a look if you want to
truly enjoy your work/life in this environment.}.
+So while the FITS table standards are very beneficial for the tools that
recognize them, they are hard to use in the vast majority of available software.
+This creates limitations for their generic use.
-@item Extracting single-valued columns from vectors (@option{--fromvector})
-Once all the input columns are read into memory, if any of them are vectors,
you can extract a single-valued column from the vector columns at this stage.
-For more on vector columns, see @ref{Vector columns}.
+Table is Gnuastro's solution to this problem.
+Table has a large set of operations that you can directly do on any recognized
table (such as selecting certain rows and doing arithmetic on the columns).
+For operations that Table does not do internally, FITS tables (ASCII or
binary) are directly accessible to the users of Unix-like operating systems (in
particular those working on the command-line or shell, see @ref{Command-line
interface}).
+With Table, a FITS table (in binary or ASCII formats) is only one command away
from AWK (or any other tool you want to use).
+Just like a plain text file that you read with the @command{cat} command.
+You can pipe the output of Table into any other tool for higher-level
processing, see the examples in @ref{Invoking asttable} for some simple
examples.
-@item Creating vector columns (@option{--tovector})
-After column arithmetic, there is no other way to add new columns so the
@option{--tovector} operator is applied at this stage.
-You can use it to merge multiple columns that are available in this stage to a
single vector column.
-For more, see @ref{Vector columns}.
+In the sections below we describe how to effectively use the Table program.
+We start with @ref{Column arithmetic}, where the basic concept and methods of
applying arithmetic operations on one or more columns are discussed.
+Afterwards, in @ref{Operation precedence in Table}, we review the various
types of operations available and their precedence in an instance of calling
Table.
+This is a good place to get a general feeling of all the things you can do
with Table.
+Finally, in @ref{Invoking asttable}, we give some examples and describe each
option in Table.
-@item Column arithmetic
-Once the final rows are selected in the requested order, column arithmetic is
done (if requested).
-For more on column arithmetic, see @ref{Column arithmetic}.
+@menu
+* Printing floating point numbers:: Optimal storage of floating point types.
+* Vector columns:: How to keep more than one value in each column.
+* Column arithmetic:: How to do operations on table columns.
+* Operation precedence in Table:: Order of running options in Table.
+* Invoking asttable:: Options and arguments to Table.
+@end menu
-@end table
+@node Printing floating point numbers, Vector columns, Table, Table
+@subsection Printing floating point numbers
+@cindex Floating point numbers
+@cindex Printing floating point numbers
+Many of the columns containing astronomical data will contain floating point
numbers (those that aren't an integer, like @mymath{1.23} or
@mymath{4.56\times10^{-7}}).
+However, printing (for human readability) of floating point numbers has some
intricacies that we will explain in this section.
+For a basic introduction to different types of integers or floating points,
see @ref{Numeric data types}.
-@item Row-based operations
-Row-based operations only work within the rows of existing columns when they
are activated.
-By default row-based operations are activated after column-based operations
(which are mentioned above).
-If you need to give precedence to row-based operations, use
@option{--rowfirst}.
+It may be tempting to simply use 64-bit floating points all the time and
skip this section altogether.
+But bear in mind that compared to the 32-bit floating point type, a 64-bit
floating point type will consume double the storage and double the RAM, and
will take almost double the time for processing.
+So when the statistical precision of your numbers is less than that offered by
32-bit floating point precision, it is much better to store them in this format.
-@table @asis
-@item Rows from other file(s) (@option{--catrowfile})
-With this feature, you can import rows from other tables (in other files, or
other HDUs of the same FITS file).
-The same column selection of @option{--column} is applied to the tables given
to this option.
-The column metadata (name, units and comments) will be taken from the main
input.
-Two conditions are mandatory for adding rows:
-@itemize
-@item
-The number of columns used from the new tables must be equal to the number of
columns in memory, by the time control reaches here.
-@item
-The data type of each column (see @ref{Numeric data types}) should be the same
as the respective column in memory by the time control reaches here.
-If the data types are different, you can use the type conversion operators of
column arithmetic which has higher precedence (and will therefore be applied
before this by default).
-For more on type conversion, see @ref{Numerical type conversion operators} and
@ref{Column arithmetic}).
-@end itemize
+Within almost all commonly used CPUs of today, numbers (including integers or
floating points) are stored in binary base-2 format (where the only digits that
can be used to represent the number are 0 and 1).
+However, we (humans) are used to numbers in base-10 (where we have 10 digits:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9).
+For integers, there is a one-to-one correspondence between a base-2 and
base-10 representation.
+Therefore, converting a base-10 integer (that you will be giving as an option
value when running a Gnuastro program, for example) to base-2 (that the
computer will store in memory), or vice-versa, will not cause any loss of
information for integers.
-@item Row selection by value in a column
-The following operations select rows based on the values in them.
-A more complete description of each of these options is given in @ref{Invoking
asttable}.
+The problem is that floating point numbers don't have such a one-to-one
correspondence between the two notations.
+The full discussion on how floating point numbers are stored in binary format
is beyond the scope of this book.
+But please have a look at the corresponding
@url{https://en.wikipedia.org/wiki/Floating-point_arithmetic, Wikipedia
article} to get a rough feeling about the complexity.
+Of course, if you are interested in the details, that Wikipedia article should
be a good starting point for further reading.
-@itemize
-@item
-@option{--range}: only keep rows where the value in the given column is within
a certain interval.
-@item
-@option{--inpolygon}: only keep rows where the value is within the polygon of
@option{--polygon}.
-@item
-@option{--outpolygon}: only keep rows outside the polygon of
@option{--polygon}.
-@item
-@option{--equal}: only keep rows with an specified value in given column.
-@item
-@option{--notequal}: only keep rows without specified value in given column.
-@item
-@option{--noblank}: only keep rows that are not blank in the given column(s).
-@end itemize
+@cindex IEEE 754 (floating point)
+The most common convention for storing floating point numbers in digital
storage is IEEE Standard for Floating-Point Arithmetic;
@url{https://en.wikipedia.org/wiki/IEEE_754, IEEE 754}.
+In short, the full width (in bits) assigned to that type (for example the 32
bits allocated for 32-bit floating point types) is divided into separate
components: The first bit is the ``sign'' (specifying if the number is negative
or positive).
+In 32-bit floats, the next 8 bits are the ``exponent'' and finally (again, in
32-bit floats), the ``fraction'' is stored in the next 23 bits.
+For example see
@url{https://commons.wikimedia.org/wiki/File:Float_example.svg, this image on
Wikipedia}.
-These options can be called any number of times (to limit the final rows based
on values in different columns for example).
-Since these are row-rejection operations, their internal order is irrelevant.
-In other words, it makes no difference if @option{--equal} is called before or
after @option{--range} for example.
+@cindex Decimal digits
+@cindex Precision of floats
+In IEEE 754, around zero, the base-2 and base-10 representations approximately
match.
+However, as we go away from 0, you will lose precision.
+The important concept in understanding the precision of floating point numbers
is ``decimal digits'', or the number of digits in the number, independent of
where the decimal point is.
+For example, @mymath{1.23} has three decimal digits and
@mymath{4.5678\times10^9} has five decimal digits.
+According to IEEE
754@footnote{@url{https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats}},
32-bit and 64-bit floating point numbers can accurately (statistically)
represent a floating point with 7.22 and 15.95 decimal digits respectively.
-As a side-effect, because NaN/blank values are defined to fail on any
condition, these operations will also remove rows with NaN/blank values in the
specified column they are checking.
-Also, the columns that are used for these operations do not necessarily have
to be in the final output table (you may not need the column after doing the
selection based on it).
+@cartouche
+@noindent
+@strong{Should I store my columns as 32-bit or 64-bit floating point type?} If
your floating point numbers have 7 decimal digits or less (for example, noisy
image pixel values, measured star or galaxy magnitudes, and anything that is
derived from them, like galaxy mass), you can safely use 32-bit
precision (the statistical error on the measurements is usually significantly
larger than 7 digits!).
+However, some columns require more digits, and thus 64-bit precision.
+For example, RA or Dec with better than one arcsecond accuracy: the degrees
can have 3 digits, and 1 arcsecond is @mymath{1/3600\sim0.0003} of a degree,
requiring 4 more digits.
+You can use the @ref{Numerical type conversion operators} of @ref{Column
arithmetic} to convert your columns to a certain type for storage.
+@end cartouche
-By default, these options are applied after merging columns from other tables.
-However, currently, the column given to these options can only come from the
main input table.
-If you need to apply these operations on columns from
@option{--catcolumnfile}, pipe the output of one instance of Table with
@option{--catcolumnfile} into another instance of Table as suggested in the box
above this list.
+The discussion above was for the storage of floating point numbers.
+When printing floating point numbers in a human-friendly format (for example,
in a plain-text file or on standard output in the command-line), the computer
has to convert its internal base-2 representation to a base-10 representation.
+This second conversion may cause a small discrepancy between the stored and
printed values.
-These row-based operations options are applied first because the speed of
later operations can be greatly affected by the number of rows.
-For example, if you also call the @option{--sort} option, and your row
selection will result in 50 rows (from an input of 10000 rows), limiting the
number of rows first will greatly speed up the sorting in your final output.
+@cartouche
+@noindent
+@strong{Use FITS tables as output of measurement programs:} When you are doing
a measurement to produce a catalog (for example with @ref{MakeCatalog}) set the
output to be a FITS table (for example @option{--output=mycatalog.fits}).
+A FITS binary table will store the same base-2 number that was measured by
the CPU.
+However, if you choose to store the output table as a plain-text table, you
risk losing information due to the human-friendly base-10 floating point
conversion (which is necessary in a plain-text output).
+@end cartouche
-@item Sorting (@option{--sort})
-Sort of the rows based on values in a certain column.
-The column to sort by can only come from the main input table columns (not
columns that may have been added with @option{--catcolumnfile}).
+To customize how columns containing floating point values are printed (in a
plain-text output file, or in the standard output in your terminal), Table has
four options for the two different types: @option{--txtf32format},
@option{--txtf32precision}, @option{--txtf64format} and
@option{--txtf64precision}.
+They are fully described in @ref{Invoking asttable}.
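+For example, the minimal sketch below feeds a number with 11 decimal
+digits into Table, converts it to 32-bit floating point (with the
+@code{f32} conversion operator of @ref{Column arithmetic}) and prints
+it with a fixed-point format and 10 decimal digits; only the first
+@mymath{\sim7} digits of the printed value will match the input:
+
+@example
+$ echo 1.2345678901 \
+      | asttable -c'arith $1 f32' \
+                 --txtf32format=fixed --txtf32precision=10
+@end example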
-@item Row selection (by position)
-@itemize
-@item
-@option{--head}: keep only requested number of top rows.
-@item
-@option{--tail}: keep only requested number of bottom rows.
-@item
-@option{--rowrandom}: keep only a random number of rows.
-@item
-@option{--rowrange}: keep only rows within a certain positional interval.
-@end itemize
-
-These options limit/select rows based on their position within the table (not
their value in any certain column).
-
-@item Transpose vector columns (@option{--transpose})
-Transposing vector columns will not affect the number or metadata of columns,
it will just re-arrange them in their 2D structure.
-As a result, after transposing, the number of rows changes, as well as the
number of elements in each vector column.
-See the description of this option in @ref{Invoking asttable} for more (with
an example).
-@end table
-
-@item Column metadata (@option{--colmetadata})
-Once the structure of the final table is set, you can set the column metadata
just before finishing.
+@cartouche
+@noindent
+@strong{Summary:} it is therefore recommended to always store your tables as
FITS (binary) tables.
+To view the contents of the table on the command-line or to feed it to a
program that doesn't recognize FITS tables, you can use the four options above
for a custom base-10 conversion that will not cause any loss of data.
+@end cartouche
-@item Output row selection (@option{--noblankend})
-Only keep the output rows that do not have a blank value in the given
column(s).
-For example, you may need to apply arithmetic operations on the columns
(through @ref{Column arithmetic}) before rejecting the undesired rows.
-After the arithmetic operation is done, you can use the @code{where} operator
to set the non-desired columns to NaN/blank and use @option{--noblankend}
option to remove them just before writing the output.
-In other scenarios, you may want to remove blank values based on columns in
another table.
-To help in readability, you can also use the final column names that you set
with @option{--colmetadata}!
-See the example below for applying any generic value-based row selection based
on @option{--noblankend}.
-@end table
+@node Vector columns, Column arithmetic, Printing floating point numbers, Table
+@subsection Vector columns
-As an example, let's review how Table interprets the command below.
-We are assuming that @file{table.fits} contains at least three columns:
@code{RA}, @code{DEC} and @code{PARAM} and you only want the RA and Dec of the
rows where @mymath{p\times 2<5} (@mymath{p} is the value of each row in the
@code{PARAM} column).
+@cindex Vector columns
+@cindex Columns (Vector)
+@cindex Multi-value columns (vector)
+In its most common format, each column of a table only has a single value in
each row.
+For example, we usually have one column for the magnitude, another column for
the RA (Right Ascension) and yet another column for the DEC (Declination) of a
set of galaxies/stars (where each galaxy is represented by one row in the
table).
+This common single-valued column format is sufficient in many scenarios.
+However, in some situations (like those below) it would help to have multiple
values for each row in each column, not just one.
-@example
-$ asttable table.fits -cRA,DEC --noblankend=MULTIP \
- -c'arith PARAM 2 x set-i i i 5 gt nan where' \
- --colmetadata=3,MULTIP,unit,"Description of column"
-@end example
+@itemize
+@item
+@cindex MUSE
+@cindex Spectrum
+@cindex Radial profile
+Conceptually: the various numbers are ``connected'' to each other.
+In other words, their order and position in relation to each other matters.
+Common examples in astronomy are the radial profiles of each galaxy in your
catalog, or their spectrum.
+For example, each
MUSE@footnote{@url{https://www.eso.org/sci/facilities/develop/instruments/muse.html}}
spectrum has 3681 points (with a sampling of 1.25 Angstroms).
-@noindent
-Due to the precedence described in this section, Table does these operations
(which are independent of the order of the operations written on the
command-line):
+Dealing with this many measurements as separate columns in your table
is cumbersome and error-prone: you may forget to copy some of them into an
output table for further analysis, mistakenly change their order, or
apply an operation to only a subset of them.
-@enumerate
-@item
-At the start (with @code{-cRA,DEC}), Table reads the @code{RA} and @code{DEC}
columns.
-@item
-In between all the operations in the command above, Column arithmetic (with
@option{-c'arith ...'}) has the highest precedence.
-So the arithmetic operation is done and stored as a new (third) column.
-In this arithmetic operation, we multiply all the values of the @code{PARAM}
column by 2, then set all those with a value larger than 5 to NaN (for more on
understanding this operation, see the `@code{set-}' and `@code{where}'
operators in @ref{Arithmetic operators}).
-@item
-Updating column metadata (with @option{--colmetadata}) is then done to give a
name (@code{MULTIP}) to the newly calculated (third) column.
-During the process, besides a name, we also set a unit and description for the
new column.
-These metadata entries are @emph{very important}, so always be sure to add
metadata after doing column arithmetic.
-@item
-The lowest precedence operation is @option{--noblankend=MULTIP}.
-So only rows that are not blank/NaN in the @code{MULTIP} column are kept.
@item
-Finally, the output table (with three columns) is written to the command-line.
-If you also want to print the column metadata, you can use the @option{-O} (or
@option{--colinfoinstdout}) option.
-Alternatively, if you want the output in a file, you can use the
@option{--output} option to save the table in FITS or plain-text format.
-@end enumerate
+Technically: in the FITS standard, you can only store a maximum of 999 columns
in a FITS table.
+Therefore, if you have more than 999 data points for each galaxy (like the
MUSE spectra example above), it is impossible to store each point in one table
as separate columns.
+@end itemize
-It may happen that your desired operation needs a separate precedence.
-In this case you can pipe the output of Table into another call of Table and
use the @option{-O} (or @option{--colinfoinstdout}) option to preserve the
metadata between the two calls.
+To address these problems, the FITS standard has defined the concept of
``vector'' columns in its Binary table format (ASCII FITS tables don't support
vector columns, but Gnuastro's plain-text format does, as described here).
+Within each row of a single vector column, we can store any number of data
points (like the MUSE spectra above or the full radial profile of each galaxy).
+All the values in a vector column have to have the same @ref{Numeric data
types}, and the number of elements within each vector column is the same for
all rows.
-For example, let's assume that you want to sort the output table from the
example command above based on the new @code{MULTIP} column.
-Since sorting is done prior to column arithmetic, you cannot do it in one
command, but you can circumvent this limitation by simply piping the output
(including metadata) to another call to Table:
+By grouping conceptually similar data points (like a spectrum) in one vector
column, we can significantly reduce the number of columns and make it much more
manageable, without losing any information!
+To demonstrate the vector column features of Gnuastro's Table program, let's
start with a randomly generated small (5 rows and 3 columns) catalog.
+This will allow us to show the outputs of each step here, but you can apply
the same concept to vectors with any number of columns.
+
+With the command below, we use @code{seq} to generate a single-column table
that is piped to Gnuastro's Table program.
+Table then uses column arithmetic to generate three columns with random values
from that column (for more, see @ref{Column arithmetic}).
+Each column becomes noisy, with standard deviations of 2, 5 and 10.
+Finally, we will add metadata to each column, giving each a different name
(using names is always the best way to work with columns):
@example
-asttable table.fits -cRA,DEC --noblankend=MULTIP --colinfoinstdout \
- -c'arith PARAM 2 x set-i i i 5 gt nan where' \
- --colmetadata=3,MULTIP,unit,"Description of column" \
- | asttable --sort=MULTIP --output=selected.fits
+$ seq 1 5 \
+ | asttable -c'arith $1 2 mknoise-sigma f32' \
+ -c'arith $1 5 mknoise-sigma f32' \
+ -c'arith $1 10 mknoise-sigma f32' \
+ --colmetadata=1,abc,none,"First column." \
+ --colmetadata=2,def,none,"Second column." \
+ --colmetadata=3,ghi,none,"Third column." \
+ --output=table.fits
@end example
-@node Invoking asttable, , Operation precedence in Table, Table
-@subsection Invoking Table
-
-Table will read/write, select, modify, or show the information of the rows and
columns in recognized Table formats (including FITS binary, FITS ASCII, and
plain text table files, see @ref{Tables}).
-Output columns can also be determined by number or regular expression matching
of column names, units, or comments.
-The executable name is @file{asttable} with the following general template
+With the command below, let's have a look at the table.
+When you run it, you will have a different random number generator seed, so
the numbers will be slightly different.
+For making reproducible random numbers, see @ref{Generating random numbers}.
+The @option{-Y} option is used for more easily readable numbers (without it,
floating point numbers are written in scientific notation, for more see
@ref{Printing floating point numbers}) and with the @option{-O} we are asking
Table to also print the metadata.
+For more on Table's options, see @ref{Invoking asttable} and for seeing how
the short options can be merged (such that @option{-Y -O} is identical to
@option{-YO}), see @ref{Options}.
@example
-$ asttable [OPTION...] InputFile
+$ asttable table.fits -YO
+# Column 1: abc [none,f32,] First column.
+# Column 2: def [none,f32,] Second column.
+# Column 3: ghi [none,f32,] Third column.
+1.074 5.535 -4.464
+0.606 -2.011 15.397
+1.475 1.811 5.687
+2.248 7.663 -7.789
+6.355 17.374 6.767
@end example
-@noindent
-One line examples:
+We see that indeed, it has three columns, with our given names.
+Now, let's assume that you want to make a two-element vector column from the
values in the @code{def} and @code{ghi} columns.
+To do that, you can use the @option{--tovector} option like below.
+As the name suggests, @option{--tovector} will merge the rows of the two
columns into one vector column with multiple values in each row.
@example
-## Get the table column information (name, data type, or units):
-$ asttable table.fits --information
-
-## Print columns named RA and DEC, followed by all the columns where
-## the name starts with "MAG_":
-$ asttable table.fits --column=RA --column=DEC --column=/^MAG_/
+$ asttable table.fits -YO --tovector=def,ghi
+# Column 1: abc [none,f32 ,] First column.
+# Column 2: def-VECTOR [none,f32(2),] Vector by merging multiple cols.
+1.074 5.535 -4.464
+0.606 -2.011 15.397
+1.475 1.811 5.687
+2.248 7.663 -7.789
+6.355 17.374 6.767
+@end example
-## Similar to the above, but with one call to `--column' (or `-c'),
-## also sort the rows by the input's photometric redshift (`Z_PHOT')
-## column. To confirm the sort, you can add `Z_PHOT' to the columns
-## to print.
-$ asttable table.fits -cRA,DEC,/^MAG_/ --sort=Z_PHOT
+@cindex Tokens
+If you ignore the metadata, this doesn't seem to have changed anything!
+You see that each line of numbers still has three ``tokens'' (to distinguish
them from ``columns'').
+But once you look at the metadata, you only see metadata for two columns, not
three.
+If you look closely, the numeric data type of the newly added second column is
`@code{f32(2)}' (look above; previously it was @code{f32}).
+The @code{(2)} shows that the second column contains two numbers/tokens not
one.
+If your vector column consisted of 3681 numbers, this would be
@code{f32(3681)}.
+Looking again at the metadata, we see that @option{--tovector} has also
created a new name and comments for the new column.
+This is always done, to avoid confusion with the old columns.
-## Similar to the above, but only print rows that have a photometric
-## redshift between 2 and 3.
-$ asttable table.fits -cRA,DEC,/^MAG_/ --range=Z_PHOT,2:3
+Let's confirm that the newly added column is indeed a single column but with
two values.
+To do this, with the command below, we'll write the output into a FITS table.
+In the same command, let's also give a more suitable name for the new
merged/vector column.
+We can get a first confirmation by looking at the table's metadata in the
second command below:
-## Only print rows with a value in the 10th column above 100000:
-$ asttable table.txt --range=10,10e5,inf
+@example
+$ asttable table.fits -YO --tovector=def,ghi --output=vec.fits \
+ --colmetadata=2,vector,nounits,"New vector column."
-## Only print the 2nd column, and the third column multiplied by 5,
-## Save the resulting two columns in `table.txt'
-$ asttable table.fits -c2,'arith $2 5 x' -otable.fits
+$ asttable vec.fits -i
+--------
+vec.fits (hdu: 1)
+------- ----- ---- -------
+No.Name Units Type Comment
+------- ----- ---- -------
+1 abc none float32 First column.
+2 vector nounits float32(2) New vector column.
+--------
+Number of rows: 5
+--------
+@end example
-## Sort the output columns by the third column, save output:
-$ asttable table.fits --sort=3 -ooutput.txt
+@noindent
+A more robust confirmation would be to print the values in the newly added
@code{vector} column.
+As expected, asking for a single column with @option{--column} (or
@option{-c}) will give us two numbers per row/line (instead of one!).
-## Subtract the first column from the second in `cat.txt' (can also
-## be a FITS table) and keep the third and fourth columns.
-$ asttable cat.txt -c'arith $2 $1 -',3,4 -ocat.fits
+@example
+$ asttable vec.fits -c vector -YO
+# Column 1: vector [nounits,f32(2),] New vector column.
+ 5.535 -4.464
+-2.011 15.397
+ 1.811 5.687
+ 7.663 -7.789
+ 17.374 6.767
+@end example
-## Convert sexagesimal coordinates to degrees (same can be done in a
-## large table given as argument).
-$ echo "7h34m35.5498 31d53m14.352s" | asttable
+If you want to keep the original single-valued columns that went into the
vector column, you can use the @code{--keepvectfin} option (read it as ``KEEP
VECtor To/From Inputs''):
-## Convert RA and Dec in degrees to sexagesimal (same can be done in a
-## large table given as argument).
-echo "113.64812416667 31.88732" \
- | asttable -c'arith $1 degree-to-ra $2 degree-to-dec'
+@example
+$ asttable table.fits -YO --tovector=def,ghi --keepvectfin \
+ --colmetadata=4,vector,nounits,"New vector column."
+# Column 1: abc [none ,f32 ,] First column.
+# Column 2: def [none ,f32 ,] Second column.
+# Column 3: ghi [none ,f32 ,] Third column.
+# Column 4: vector [nounits,f32(2),] New vector column.
+1.074 5.535 -4.464 5.535 -4.464
+0.606 -2.011 15.397 -2.011 15.397
+1.475 1.811 5.687 1.811 5.687
+2.248 7.663 -7.789 7.663 -7.789
+6.355 17.374 6.767 17.374 6.767
@end example
-Table's input dataset can be given either as a file or from Standard input
(piped from another program, see @ref{Standard input}).
-In the absence of selected columns, all the input's columns and rows will be
written to the output.
-The full set of operations Table can do are described in detail below, but for
a more high-level introduction to the various operations, and their precedence,
see @ref{Operation precedence in Table}.
+Now that you know how to create vector columns, let's assume you have the
inverse scenario: you want to extract one of the values of a vector column into
a separate single-valued column.
+To do this, you can use the @option{--fromvector} option.
+The @option{--fromvector} option takes the name (or counter) of a vector
column, followed by any number of integer counters (counting from 1).
+It will extract those elements into separate single-valued columns.
+For example, let's assume you want to extract the second element of the
@code{vector} column in the file you made before:
-If any output file is explicitly requested (with @option{--output}) the output
table will be written in it.
-When no output file is explicitly requested the output table will be written
to the standard output.
-If the specified output is a FITS file, the type of FITS table (binary or
ASCII) will be determined from the @option{--tabletype} option.
-If the output is not a FITS file, it will be printed as a plain text table
(with space characters between the columns).
-When the output is not binary (for example standard output or a plain-text),
the @option{--txtf32*} or @option{--txtf64*} options can be used for the
formatting of floating point columns (see @ref{Printing floating point
numbers}).
-When the columns are accompanied by meta-data (like column name, units, or
comments), this information will also printed in the plain text file before the
table, as described in @ref{Gnuastro text table format}.
+@example
+$ asttable vec.fits --fromvector=vector,2 -YO
+# Column 1: abc [none ,f32,] First column.
+# Column 2: vector-2 [nounits,f32,] New vector column.
+1.074 -4.464
+0.606 15.397
+1.475 5.687
+2.248 -7.789
+6.355 6.767
+@end example
-For the full list of options common to all Gnuastro programs please see
@ref{Common options}.
-Options can also be stored in directory, user or system-wide configuration
files to avoid repeating on the command-line, see @ref{Configuration files}.
-Table does not follow Automatic output that is common in most Gnuastro
programs, see @ref{Automatic output}.
-Thus, in the absence of an output file, the selected columns will be printed
on the command-line with no column information, ready for redirecting to other
tools like @command{awk}.
+@noindent
+Just like the case with @option{--tovector} above, if you want to keep the
input vector column, use @option{--keepvectfin}.
+This feature is useful in scenarios where you want to select some rows based
on a single element (or multiple) of the vector column.
@cartouche
@noindent
-@strong{Sexagesimal coordinates as floats in plain-text tables:}
-When a column is determined to be a floating point type (32-bit or 64-bit) in
a plain-text table, it can contain sexagesimal values in the format of
`@code{_h_m_s}' (for RA) and `@code{_d_m_s}' (for Dec), where the `@code{_}'s
are place-holders for numbers.
-In this case, the string will be immediately converted to a single floating
point number (in units of degrees) and stored in memory with the rest of the
column or table.
-Besides being useful in large tables, with this feature, conversion to
sexagesimal coordinates to degrees becomes very easy, for example:
-@example
-echo "7h34m35.5498 31d53m14.352s" | asttable
-@end example
-
-@noindent
-The inverse can also be done with the more general column arithmetic
-operators:
-@example
-echo "113.64812416667 31.88732" \
- | asttable -c'arith $1 degree-to-ra $2 degree-to-dec'
-@end example
-
-@noindent
-If you want to preserve the sexagesimal contents of a column, you should store
that column as a string, see @ref{Gnuastro text table format}.
+@strong{Vector columns and FITS ASCII tables:} As mentioned above, the FITS
standard only recognizes vector columns in its Binary table format (the default
FITS table format in Gnuastro).
+You can still use the @option{--tableformat=fits-ascii} option to write your
tables in the FITS ASCII format (see @ref{Input output options}).
+In this case, if a vector column is present, it will be written as separate
single-element columns to avoid losing information (as if you had called
@option{--fromvector} on all the elements of the vector column).
+A warning is printed if this occurs.
@end cartouche
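+For example, you can see this for yourself with the minimal sketch
+below (on the @file{vec.fits} built above): the vector column will be
+expanded into single-valued columns in the FITS ASCII output:
+
+@example
+$ asttable vec.fits --output=vec-ascii.fits \
+           --tableformat=fits-ascii
+$ asttable vec-ascii.fits -i
+@end example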
-@table @option
+For an application of the vector column concepts introduced here on MUSE data,
see the 3D data cube tutorial and in particular these two sections: @ref{3D
measurements and spectra} and @ref{Extracting a single spectrum and plotting
it}.
-@item -i
-@itemx --information
-Only print the column information in the specified table on the command-line
and exit.
-Each column's information (number, name, units, data type, and comments) will
be printed as a row on the command-line.
-If the column is a multi-value (vector) a @code{[N]} is printed after the
type, where @code{N} is the number of elements within that vector.
+@node Column arithmetic, Operation precedence in Table, Vector columns, Table
+@subsection Column arithmetic
-Note that the FITS standard only requires the data type (see @ref{Numeric data
types}), and in plain text tables, no meta-data/information is mandatory.
-Gnuastro has its own convention in the comments of a plain text table to store
and transfer this information as described in @ref{Gnuastro text table format}.
+In many scenarios, you want to apply some kind of operation on the columns and
save them in another table or feed them into another program.
+With Table you can do a rich set of operations on the contents of one or more
columns in a table, and save the resulting values as new column(s) in the
output table.
+For seeing the precedence of Column arithmetic in relation to other Table
operators, see @ref{Operation precedence in Table}.
-This option will take precedence over all other operations in Table, so when
it is called along with other operations, they will be ignored, see
@ref{Operation precedence in Table}.
-This can be useful if you forget the identifier of a column after you have
already typed some on the command-line.
-You can simply add a @option{-i} to your already-written command (without
changing anything) and run Table, to see the whole list of column names and
information.
-Then you can use the shell history (with the up arrow key on the keyboard),
and retrieve the last command with all the previously typed columns present,
delete @option{-i} and add the identifier you had forgot.
+To enable column arithmetic, the first 6 characters of the value to
+@option{--column} (@code{-c}) should be the activation word `@option{arith }'
(note the space character at the end, after `@code{arith}').
+After the activation word, you can use reverse polish notation to identify the
operators and their operands, see @ref{Reverse polish notation}.
+Just note that white-space characters are used between the tokens of the
arithmetic expression and that they are meaningful to the command-line
environment.
+Therefore the whole expression (including the activation word) has to be
quoted on the command-line or in a shell script (see the examples below).
-@item --info-num-cols
-Similar to @option{--information}, but only the number of the input table's
columns will be printed as a single integer (useful in scripts for example).
+To identify a column you can directly use its name, or specify its number
(counting from one, see @ref{Selecting table columns}).
+When you are giving a column number, it is necessary to prefix the number with
a @code{$}, similar to AWK.
+Otherwise the number is not distinguishable from a constant number to use in
the arithmetic operation.
-@item --info-num-rows
-Similar to @option{--information}, but only the number of the input table's
rows will be printed as a single integer (useful in scripts for example).
+For example, with the command below, the first two columns of
@file{table.fits} will be printed along with a third column that is the result
of multiplying the first column with @mymath{10^{10}} (for example, to convert
wavelength from Meters to Angstroms).
+Note that without the `@key{$}', it is not possible to distinguish between
``1'' as a column-counter, or ``1'' as a constant number to use in the
arithmetic operation.
+Also note that because of the significance of @key{$} for the command-line
environment, the single-quotes are the recommended quoting method (as in an AWK
expression), not double-quotes (for the significance of using single quotes see
the box below).
-@cindex AWK
-@cindex GNU AWK
-@item -c STR/INT
-@itemx --column=STR/INT
-Set the output columns either by specifying the column number, or name.
-For more on selecting columns, see @ref{Selecting table columns}.
-If a value of this option starts with `@code{arith }', column arithmetic will
be activated, allowing you to edit/manipulate column contents.
-For more on column arithmetic see @ref{Column arithmetic}.
+@example
+$ asttable table.fits -c1,2 -c'arith $1 1e10 x'
+@end example
-To ask for multiple columns this option can be used in two ways: 1) multiple
calls to this option, 2) using a comma between each column specifier in one
call to this option.
-These different solutions may be mixed in one call to Table: for example,
`@option{-cRA,DEC,MAG}', or `@option{-cRA,DEC -cMAG}' are both equivalent to
`@option{-cRA -cDEC -cMAG}'.
-The order of the output columns will be the same order given to the option or
in the configuration files (see @ref{Configuration file precedence}).
+@cartouche
+@noindent
+@strong{Single quotes when string contains @key{$}}: On the command-line, or
in shell-scripts, @key{$} is used to expand variables, for example, @code{echo
$PATH} prints the value (a string of characters) in the variable @code{PATH},
it will not simply print @code{$PATH}.
+This operation is also permitted within double quotes, so @code{echo "$PATH"}
will produce the same output.
+This is good when printing values, for example, in the command below,
@code{$PATH} will expand to the value within it.
-This option is not mandatory, if no specific columns are requested, all the
input table columns are output.
-When this option is called multiple times, it is possible to output one column
more than once.
+@example
+$ echo "My path is: $PATH"
+@end example
-@item -w FITS
-@itemx --wcsfile=FITS
-FITS file that contains the WCS to be used in the @code{wcs-to-img} and
@code{img-to-wcs} operators of @ref{Column arithmetic}.
-The extension name/number within the FITS file can be specified with
@option{--wcshdu}.
+If you actually want to return the literal string @code{$PATH}, not the value
in the @code{PATH} variable (like the scenario here in column arithmetic), you
should put it in single quotes like below.
+The printed value here will include the @code{$}; please try it for
yourself and compare with the output above.
-If the value to this option is `@option{none}', no WCS will be written in the
output.
+@example
+$ echo 'My path is: $PATH'
+@end example
-@item -W STR
-@itemx --wcshdu=STR
-FITS extension/HDU in the FITS file given to @option{--wcsfile} (see the
description of @option{--wcsfile} for more).
+Therefore, when your column arithmetic involves the @key{$} sign (to specify
columns by number), quote your @code{arith } string with a single quotation
mark.
+Otherwise you can use both single or double quotes.
+@end cartouche
-@item -L FITS/TXT
-@itemx --catcolumnfile=FITS/TXT
-Concatenate (or add, or append) the columns of this option's value (a
filename) to the output columns.
-This option may be called multiple times (to add columns from more than one
file into the final output), the columns from each file will be added in the
same order that this option is called.
-The number of rows in the file(s) given to this option has to be the same as
the input table (before any type of row-selection), see @ref{Operation
precedence in Table}.
-By default all the columns of the given file will be appended, if you only
want certain columns to be appended, use the @option{--catcolumns} option to
specify their name or number (see @ref{Selecting table columns}).
-Note that the columns given to @option{--catcolumns} must be present in all
the given files (if this option is called more than once with more than one
file).
-If the file given to this option is a FITS file, it is necessary to also
define the corresponding HDU/extension with @option{--catcolumnhdu}.
-Also note that no operation (such as row selection and arithmetic) is applied
to the table given to this option.
-If the appended columns have a name, and their name is already present in the
table before adding those columns, the column names of each file will be
appended with a @code{-N}, where @code{N} is a counter starting from 1 for each
appended table.
-Just note that in the FITS standard (and thus in Gnuastro), column names are
not case-sensitive.
+@cartouche
+@noindent
+@strong{Manipulate all columns in one call using @key{$_all}}: Usually we
manipulate one column in one call of column arithmetic.
+For instance, with the command below the elements of the @code{AWAV} column
will be summed.
-This is done because when concatenating columns from multiple tables (more
than two) into one, they may have the same name, and it is not good practice to
have multiple columns with the same name.
-You can disable this feature with @option{--catcolumnrawname}.
-Generally, you can use the @option{--colmetadata} option to update column
metadata in the same command, after all the columns have been concatenated.
+@example
+$ asttable table.fits -c'arith AWAV sumvalue'
+@end example
-For example, let's assume you have two catalogs of the same objects (same
number of rows) in different filters.
-Such that @file{f160w-cat.fits} has a @code{MAGNITUDE} column that has the
magnitude of each object in the @code{F160W} filter and similarly
@file{f105w-cat.fits}, also has a @code{MAGNITUDE} column, but for the
@code{F105W} filter.
-You can use column concatenation like below to import the @code{MAGNITUDE}
column from the @code{F105W} catalog into the @code{F160W} catalog, while
giving each magnitude column a different name:
+But sometimes, we want to manipulate more than one column with the same
expression.
+For example, we may want to sum the elements of every column.
+In this case we could use the following command (assuming that the table has
four different @code{AWAV-*} columns):
@example
-asttable f160w-cat.fits --output=both.fits \
- --catcolumnfile=f105w-cat.fits --catcolumns=MAGNITUDE \
- --colmetadata=MAGNITUDE,MAG-F160W,log,"Magnitude in F160W" \
- --colmetadata=MAGNITUDE-1,MAG-F105W,log,"Magnitude in F105W"
+$ asttable table.fits -c'arith AWAV-1 sumvalue' \
+ -c'arith AWAV-2 sumvalue' \
+ -c'arith AWAV-3 sumvalue' \
+ -c'arith AWAV-4 sumvalue'
@end example
-@noindent
-For a more complete example, see @ref{Working with catalogs estimating colors}.
+To avoid repetition and mistakes, instead of using column arithmetic many
times, we can use the @code{$_all} identifier.
+When column arithmetic encounters this special string, it will repeat the
+expression for all the columns in the input table.
+Therefore the command above can be written as:
+
+@example
+$ asttable table.fits -c'arith $_all sumvalue'
+@end example
-@cartouche
-@noindent
-@strong{Loading external columns with Arithmetic:} an alternative way to load
-external columns into your output is to use column arithmetic (@ref{Column
-arithmetic}), in particular the @option{load-col-} operator described in
-@ref{Loading external columns}.
-But this operator will load only one column per file/HDU every time it is
-called.
-So if you have many columns to insert, it is much faster to use
-@option{--catcolumnfile}, because it will load all the columns in one opening
-of the file, and possibly even read them all into memory in parallel!
@end cartouche
-@item -u STR/INT
-@itemx --catcolumnhdu=STR/INT
-The HDU/extension of the FITS file(s) that should be concatenated, or
appended, by column with @option{--catcolumnfile}.
-If @option{--catcolumnfile} is called more than once with more than one FITS
-file, it is necessary to call this option more than once.
-The HDUs will be loaded in the same order as the FITS files given to
@option{--catcolumnfile}.
+Alternatively, if the columns have meta-data and the first two are
respectively called @code{AWAV} and @code{SPECTRUM}, the command above is
equivalent to the command below.
+Note that the character `@key{$}' is no longer necessary in this scenario
(because names will not be confused with numbers):
-@item -C STR/INT
-@itemx --catcolumns=STR/INT
-The column(s) in the file(s) given to @option{--catcolumnfile} to append.
-When this option is not given, all the columns will be concatenated.
-See @option{--catcolumnfile} for more.
+@example
+$ asttable table.fits -cAWAV,SPECTRUM -c'arith AWAV 1e10 x'
+@end example
-@item --catcolumnrawname
-Do not modify the names of the concatenated (appended) columns; see the
-description in @option{--catcolumnfile}.
+Comparison of the two commands above clearly shows why it is recommended to
use column names instead of numbers.
+When the columns have descriptive names, the command/script actually becomes
much more readable, describing the intent of the operation.
+It is also independent of the low-level table structure: for the second
+command, the column numbers of the @code{AWAV} and @code{SPECTRUM} columns in
+@file{table.fits} are irrelevant.
-@item --transpose
-Transpose (as in a matrix) the given vector column(s) individually.
-When this operation is done (see @ref{Operation precedence in Table}), only
vector columns of the same data type and with the same number of elements
should exist in the table.
-A usage of this operator is presented in the IFU spectroscopy tutorial in
@ref{Extracting a single spectrum and plotting it}.
+Column arithmetic changes the values of the data within the column, so the
+old column metadata cannot be used any more.
+By default the output column of an arithmetic operation will be given generic
+metadata (for example, its name will be @code{ARITH_1}, which is hardly
+useful!).
+But metadata are critically important, and it is good practice for every
+column to have a short but descriptive name, units, and a comment for further
+explanation.
+To add metadata to a column, you can use the @option{--colmetadata} option
that is described in @ref{Invoking asttable} and @ref{Operation precedence in
Table}.
-As a generic example, see the commands below.
-The @file{in.txt} table below has two vector columns (each with three
elements) in two rows.
-After running @command{asttable} with @option{--transpose}, you can see how
the vector columns have two elements per row (@code{u8(3)} has been replaced by
@code{u8(2)}), and that the table now has three rows.
+Since an arithmetic expression is a value to @option{--column}, it does not
+necessarily have to be a separate option; the commands above are therefore
+identical to the command below (note that it only has one @option{-c} option).
+Just be very careful with the quoting!
+With the @option{--colmetadata} option, we are also giving a name, units and a
comment to the third column.
@example
-$ cat in.txt
-# Column 1: abc [nounits,u8(3),] First vector column.
-# Column 2: def [nounits,u8(3),] Second vector column.
-111 112 113 211 212 213
-121 122 123 221 222 223
-
-$ asttable in.txt --transpose -O
-# Column 1: abc [nounits,u8(2),] First vector column.
-# Column 2: def [nounits,u8(2),] Second vector column.
-111 121 211 221
-112 122 212 222
-113 123 213 223
+$ asttable table.fits -cAWAV,SPECTRUM,'arith AWAV 1e10 x' \
+ --colmetadata=3,AWAV_A,angstrom,"Wavelength (in Angstroms)"
@end example
-@item --fromvector=STR,INT[,INT[,INT]]
-Extract the given tokens/elements from the given vector column into separate
single-valued columns.
-The input vector column can be identified by its name or counter, see
@ref{Selecting table columns}.
-After the columns are extracted, the input vector is deleted by default.
-To preserve the input vector column, you can use @option{--keepvectfin}
described below.
-For a complete usage scenario see @ref{Vector columns}.
+In case you need to append columns from other tables (with
@option{--catcolumnfile}), you can use those extra columns in column arithmetic
also.
+The easiest and most robust way is to ensure that your columns of interest
+(in all the files whose columns are to be merged) have different names.
+In this scenario, you can simply use the names of the columns you plan to
append.
+If there are similar names, note that by default Table appends a @code{-N} to
similar names (where @code{N} is the file counter given to
@option{--catcolumnfile}, see the description of @option{--catcolumnfile} for
more).
+Using column numbers can get complicated: if the number is smaller than the
+main input's number of columns, the main input's column will be used.
+Otherwise (when the requested column number is larger than the main input's
+number of columns), the column number of the final output (after appending
+the columns of all the given files) will be used.
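+
+For example, the minimal sketch below (using two hypothetical catalogs
+@file{f160w-cat.fits} and @file{f105w-cat.fits} that both contain a
+@code{MAGNITUDE} column with the same number of rows) appends the second
+catalog's magnitude (automatically renamed to @code{MAGNITUDE-1}) and uses it
+in column arithmetic to compute a color:
+
+@example
+$ asttable f160w-cat.fits --catcolumnfile=f105w-cat.fits \
+           --catcolumnhdu=1 --catcolumns=MAGNITUDE \
+           -c'arith MAGNITUDE-1 MAGNITUDE -'
+@end example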
-@item --tovector=STR/INT,STR/INT[,STR/INT]
-Move the given columns into a newly created vector column.
-The given columns can be identified by their name or counter, see
@ref{Selecting table columns}.
-After the columns are copied, they are deleted by default.
-To preserve the inputs, you can use @option{--keepvectfin} described below.
-For a complete usage scenario see @ref{Vector columns}.
+Almost all the arithmetic operators of @ref{Arithmetic operators} are also
supported for column arithmetic in Table.
+In particular, the few that are not present in the Gnuastro
library@footnote{For a list of the Gnuastro library arithmetic operators,
please see the macros starting with @code{GAL_ARITHMETIC_OP} and ending with
the operator name in @ref{Arithmetic on datasets}.} are not yet supported for
column arithmetic.
+Besides the operators in @ref{Arithmetic operators}, several operators are
+available only in Table, for use on table columns.
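+
+For example, the command below is a minimal sketch (on a hypothetical catalog
+with a @code{FLUX} column) of using the generic @code{log10} and
+multiplication operators of @ref{Arithmetic operators} within column
+arithmetic:
+
+@example
+$ asttable cat.fits -c'arith FLUX log10 -2.5 x'
+@end example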
-@item -k
-@itemx --keepvectfin
-Do not delete the input column(s) when using @option{--fromvector} or
@option{--tovector}.
-@item -R FITS/TXT
-@itemx --catrowfile=FITS/TXT
-Add the rows of the given file to the output table.
-The selected columns in the table(s) given to this option should have the
-same number and data type as the columns in memory by the time control
-reaches this phase (after column selection and column concatenation); for
-more, see @ref{Operation precedence in Table}.
+@cindex WCS: World Coordinate System
+@cindex World Coordinate System (WCS)
+@table @code
+@item wcs-to-img
+Convert the given WCS positions to image/dataset coordinates based on the
number of dimensions in the WCS structure of @option{--wcshdu} extension/HDU in
@option{--wcsfile}.
+It will output the same number of columns.
+The first popped operand is the last FITS dimension.
-For example, if @file{a.fits}, @file{b.fits} and @file{c.fits} have the
columns @code{RA}, @code{DEC} and @code{MAGNITUDE} (possibly in different
column-numbers in their respective table, along with many more columns), the
command below will add their rows into the final output that will only have
these three columns:
+For example, the two commands below (which have the same output) will produce
5 columns.
+The first three columns are the input table's ID, RA and Dec columns.
+The fourth and fifth columns will be the pixel positions in @file{image.fits}
that correspond to each RA and Dec.
@example
-$ asttable a.fits --catrowfile=b.fits --catrowhdu=1 \
- --catrowfile=c.fits --catrowhdu=1 \
- -cRA,DEC,MAGNITUDE --output=allrows.fits
+$ asttable table.fits -cID,RA,DEC,'arith RA DEC wcs-to-img' \
+ --wcsfile=image.fits
+$ asttable table.fits -cID,RA -cDEC \
+ -c'arith RA DEC wcs-to-img' --wcsfile=image.fits
@end example
-@cartouche
-@cindex Provenance
-@noindent
-@strong{Provenance of each row:} When merging rows from separate catalogs, it
is important to keep track of the source catalog of each row (its provenance).
-To do this, you can use @option{--catrowfile} in combination with the
@code{constant} operator and @ref{Column arithmetic}.
-For a working example of this scenario, see the example within the
documentation of the @code{constant} operator in @ref{Building new dataset and
stack management}.
-@end cartouche
-
-@cartouche
-@noindent
-@strong{How to avoid repetition when adding rows:} this option will simply add
the rows of multiple tables into one, it does not check their contents!
-Therefore if you use this option on multiple catalogs that may have some
shared physical objects in some of their rows, those rows/objects will be
repeated in the final table.
-In such scenarios, to avoid potential repetition, it is better to use
@ref{Match} (with @option{--notmatched} and @option{--outcols=AAA,BBB}) instead
of Table.
-For more on using Match for this scenario, see the description of
@option{--outcols} in @ref{Invoking astmatch}.
-@end cartouche
+@item img-to-wcs
+Similar to @code{wcs-to-img}, except that image/dataset coordinates are
converted to WCS coordinates.
-@item -X STR
-@itemx --catrowhdu=STR
-The HDU/extension of the FITS file(s) that should be concatenated, or
appended, by rows with @option{--catrowfile}.
-If @option{--catrowfile} is called more than once with more than one FITS
file, it is necessary to call this option more than once also (once for every
FITS table given to @option{--catrowfile}).
-The HDUs will be loaded in the same order as the FITS files given to
@option{--catrowfile}.
+@item distance-flat
+Return the distance between two points assuming they are on a flat surface.
+Note that each point needs two coordinates, so this operator needs four
operands (currently it only works for 2D spaces).
+The first and second popped operands are considered to belong to one point and
the third and fourth popped operands to the second point.
-@item -O
-@itemx --colinfoinstdout
-@cindex Standard output
-Add column metadata when the output is printed in the standard output.
-Usually the standard output is used for a fast visual check, or to pipe into
other metadata-agnostic programs (like AWK) for further processing.
-So by default meta-data are not included.
-But when piping to other Gnuastro programs (where metadata can be interpreted
and used) it is recommended to use this option and use column names in the next
program.
+Each of the input points can be a single coordinate or a full table column
(containing many points).
+In other words, the following commands are all valid:
-@item -r STR,FLT:FLT
-@itemx --range=STR,FLT:FLT
-Only output rows that have a value within the given range in the @code{STR}
column (can be a name or counter).
-Note that the range is only inclusive in the lower-limit.
-For example, with @code{--range=sn,5:20} the output's columns will only
contain rows that have a value in the @code{sn} column (not case-sensitive)
that is greater or equal to 5, and less than 20.
-You can also use a comma for separating the values, as in
-@code{--range=sn,5,20}.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@example
+$ asttable table.fits \
+ -c'arith X1 Y1 X2 Y2 distance-flat'
+$ asttable table.fits \
+ -c'arith X Y 12.345 6.789 distance-flat'
+$ asttable table.fits \
+ -c'arith 12.345 6.789 X Y distance-flat'
+@end example
-This option can be called multiple times (different ranges for different
columns) in one run of the Table program.
-This is very useful for selecting the final rows from multiple
criteria/columns.
+In the first case, we are assuming that @file{table.fits} has the four
+columns @code{X1}, @code{Y1}, @code{X2} and @code{Y2}.
+The column returned by this operator will be the distance between the two
+points of each row, with coordinates (@code{X1}, @code{Y1}) and (@code{X2},
+@code{Y2}).
+In other words, for each row, the distance between its two points is
+calculated.
+In the second and third cases (which are identical), it is assumed that
+@file{table.fits} has the two columns @code{X} and @code{Y}.
+The column returned by this operator will be the distance of each row's point
+from the fixed point at (12.345, 6.789).
-The chosen column does not have to be in the output columns.
-This is good when you just want to select using one column's values, but do
not need that column anymore afterwards.
+@item distance-on-sphere
+Return the spherical angular distance (along a great circle, in degrees)
between the given two points.
+Note that each point needs two coordinates (in degrees), so this operator
needs four operands.
+The first and second popped operands are considered to belong to one point and
the third and fourth popped operands to the second point.
-For one example of using this option, see the example under
-@option{--sigclip-median} in @ref{Invoking aststatistics}.
+Each of the input points can be a single coordinate or a full table column
(containing many points).
+In other words, the following commands are all valid:
-@item --inpolygon=STR1,STR2
-Only return rows where the given coordinates are inside the polygon specified
by the @option{--polygon} option.
-The coordinate columns are the given @code{STR1} and @code{STR2} columns, they
can be a column name or counter (see @ref{Selecting table columns}).
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@example
+$ asttable table.fits \
+ -c'arith RA1 DEC1 RA2 DEC2 distance-on-sphere'
+$ asttable table.fits \
+ -c'arith RA DEC 9.876 5.432 distance-on-sphere'
+$ asttable table.fits \
+ -c'arith 9.876 5.432 RA DEC distance-on-sphere'
+@end example
-Note that the chosen columns do not have to be in the output columns (which
-are specified by the @option{--column} option).
-For example, if we want to select rows in the polygon specified in
@ref{Dataset inspection and cropping}, this option can be used like this (you
can remove the double quotations and write them all in one line if you remove
the white-spaces around the colon separating the column vertices):
+In the first case, we are assuming that @file{table.fits} has the four
+columns @code{RA1}, @code{DEC1}, @code{RA2} and @code{DEC2}.
+The column returned by this operator will be the angular distance between the
+two points of each row, with coordinates (@code{RA1}, @code{DEC1}) and
+(@code{RA2}, @code{DEC2}).
+In other words, for each row, the angular distance between its two points is
+calculated.
+In the second and third cases (which are identical), it is assumed that
+@file{table.fits} has the two columns @code{RA} and @code{DEC}.
+The column returned by this operator will be the angular distance of each
+row's point from the fixed point at (9.876, 5.432).
+
+The distance (along a great circle) on a sphere between two points is
calculated with the equation below, where @mymath{r_1}, @mymath{r_2},
@mymath{d_1} and @mymath{d_2} are the right ascensions and declinations of
points 1 and 2.
+
+@dispmath {\cos(d)=\sin(d_1)\sin(d_2)+\cos(d_1)\cos(d_2)\cos(r_1-r_2)}
+@item ra-to-degree
+Convert the hour-wise Right Ascension (RA) string, in the sexagesimal format
of @code{_h_m_s} or @code{_:_:_}, to degrees.
+Note that the input column has to have a string format.
+In FITS tables, string columns are well-defined.
+For plain-text tables, please follow the standards defined in @ref{Gnuastro
text table format}, otherwise the string column will not be read.
@example
-asttable table.fits --inpolygon=RA,DEC \
- --polygon="53.187414,-27.779152 \
- : 53.159507,-27.759633 \
- : 53.134517,-27.787144 \
- : 53.161906,-27.807208" \
+$ asttable catalog.fits -c'arith RA ra-to-degree'
+$ asttable catalog.fits -c'arith $5 ra-to-degree'
@end example
-@cartouche
-@noindent
-@strong{Flat/Euclidean space: } The @option{--inpolygon} option assumes a
flat/Euclidean space so it is only correct for RA and Dec when the polygon size
is very small like the example above.
-If your polygon is a degree or larger, it may not return correct results.
-Please get in touch if you need such a feature (see @ref{Suggest new feature}).
-@end cartouche
-
-@item --outpolygon=STR1,STR2
-Only return rows where the given coordinates are outside the polygon specified
by the @option{--polygon} option.
-This option is very similar to the @option{--inpolygon} option, so see the
description there for more.
+@item dec-to-degree
+Convert the sexagesimal Declination (Dec) string, in the format of
@code{_d_m_s} or @code{_:_:_}, to degrees (a single floating point number).
+For more details please see the @option{ra-to-degree} operator.
-@item --polygon=STR
-@itemx --polygon=FLT,FLT:FLT,FLT:...
-The polygon to use for the @code{--inpolygon} and @option{--outpolygon}
options.
-This option is parsed in an identical way to the same option in the Crop
program, so for more information on how to use it, see @ref{Crop options}.
+@item degree-to-ra
+@cindex Sexagesimal
+@cindex Right Ascension
+Convert degrees (a column with a single floating point number) to the Right
Ascension, RA, string (in the sexagesimal format hours, minutes and seconds,
written as @code{_h_m_s}).
+The output will be a string column so no further mathematical operations can
be done on it.
+The output file can be in any format (for example, FITS or plain-text).
+If it is plain-text, the string column will be written following the standards
described in @ref{Gnuastro text table format}.
-@item -e STR,INT/FLT,...
-@itemx --equal=STR,INT/FLT,...
-Only output rows that are equal to the given number(s) in the given column.
-The first argument is the column identifier (name or number, see
@ref{Selecting table columns}), after that you can specify any number of values.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@item degree-to-dec
+@cindex Declination
+Convert degrees (a column with a single floating point number) to the
Declination, Dec, string (in the format of @code{_d_m_s}).
+See the @option{degree-to-ra} operator for more on the format of the output.
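+
+For example, the minimal sketch below (on a hypothetical catalog with
+@code{RA} and @code{DEC} columns in degrees) converts both back into
+sexagesimal strings:
+
+@example
+$ asttable catalog.fits -c'arith RA degree-to-ra' \
+           -c'arith DEC degree-to-dec'
+@end example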
-For example, @option{--equal=ID,5,6,8} will only print the rows that have a
value of 5, 6, or 8 in the @code{ID} column.
-This option can also be called multiple times, so @option{--equal=ID,4,5
--equal=ID,6,7} has the same effect as @option{--equal=ID,4,5,6,7}.
+@item date-to-sec
+@cindex Unix epoch time
+@cindex Time, Unix epoch
+@cindex Epoch, Unix time
+Return the number of seconds from the Unix epoch time (00:00:00 Thursday,
January 1st, 1970).
+The input (popped) operand should be a string column in the FITS date format
(most generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}).
-@cartouche
-@noindent
-@strong{Equality and floating point numbers:} Floating point numbers are only
approximate values (see @ref{Numeric data types}).
-In this context, their equality depends on how the input table was originally
stored (as a plain text table or as an ASCII/binary FITS table).
-If you want to select floating point numbers, it is strongly recommended to
-use the @option{--range} option and set a very small interval around your
-desired number; do not use @option{--equal} or @option{--notequal}.
-@end cartouche
+The returned operand will be named @code{UNIXSEC} (short for Unix-seconds) and
will be a 64-bit, signed integer, see @ref{Numeric data types}.
+If the input string has sub-second precision, it will be ignored because
floating point numbers cannot accurately store numbers with many significant
digits.
+To preserve sub-second precision, please use @code{date-to-millisec}.
-The @option{--equal} and @option{--notequal} options also work when the given
column has a string type.
-In this case the given value to the option will also be parsed as a string,
not as a number.
-When dealing with string columns, be careful with trailing white space
-characters (the actual value may be adjusted to the right, left, or center of
-the column's width).
-If you need to account for such white spaces, you can use shell quoting.
-For example, @code{--equal=NAME," myname "}.
+For example, in the command below we are using this operator, in combination
+with the @option{--keyvalue} option of the Fits program, to sort your desired
+FITS files by observation date (the value in the @code{DATE-OBS} keyword):
-@cartouche
-@noindent
-@strong{Strings with a comma (,):} When your desired column values contain a
comma, you need to put a `@code{\}' before the internal comma (within the
value).
-Otherwise, the comma will be interpreted as a delimiter between multiple
values, and anything after it will be interpreted as a separate string.
-For example, assume column @code{AB} of your @file{table.fits} contains this
value: `@code{cd,ef}' in your desired rows.
-To extract those rows, you should use the command below:
@example
-$ asttable table.fits --equal=AB,cd\,ef
+$ astfits *.fits --keyvalue=DATE-OBS --colinfoinstdout \
+ | asttable -cFILENAME,'arith DATE-OBS date-to-sec' \
+ --colinfoinstdout \
+ | asttable --sort=UNIXSEC
@end example
-@end cartouche
-
-@item -n STR,INT/FLT,...
-@itemx --notequal=STR,INT/FLT,...
-Only output rows that are @emph{not} equal to the given number(s) in the given
column.
-The first argument is the column identifier (name or number, see
@ref{Selecting table columns}), after that you can specify any number of values.
-For example, @option{--notequal=ID,5,6,8} will only print the rows where the
@code{ID} column does not have value of 5, 6, or 8.
-This option can also be called multiple times, so @option{--notequal=ID,4,5
--notequal=ID,6,7} has the same effect as @option{--notequal=ID,4,5,6,7}.
-Be very careful if you want to use the non-equality with floating point
numbers, see the special note under @option{--equal} for more.
-This option also works when the given column has a string type, see the
description under @option{--equal} (above) for more.
+If you do not need to see the Unix-seconds any more, you can add a
@option{-cFILENAME} (short for @option{--column=FILENAME}) at the end.
+For more on @option{--keyvalue}, see @ref{Keyword inspection and manipulation}.
-@item -b STR[,STR[,STR]]
-@itemx --noblank=STR[,STR[,STR]]
-Only output rows that are @emph{not} blank in the given column of the
@emph{input} table.
-Like above, the columns can be specified by their name or number (counting
from 1).
-This option can be called multiple times, so @option{--noblank=MAG
--noblank=PHOTOZ} is equivalent to @option{--noblank=MAG,PHOTOZ}.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@item date-to-millisec
+Return the number of milli-seconds from the Unix epoch time (00:00:00
Thursday, January 1st, 1970).
+The input (popped) operand should be a string column in the FITS date format
(most generally: @code{YYYY-MM-DDThh:mm:ss.ddd...}, where @code{.ddd} is the
optional sub-second component).
-For example, if @file{table.fits} has blank values (NaN in floating point
types) in the @code{magnitude} and @code{sn} columns, with
@code{--noblank=magnitude,sn}, the output will not contain any rows with blank
values in these two columns.
+The returned operand will be named @code{UNIXMILLISEC} (short for Unix
milli-seconds) and will be a 64-bit, signed integer, see @ref{Numeric data
types}.
+The returned value is not a floating point type because for large numbers,
+floating point data types lose single-digit precision (which is important
+here).
-If you want @emph{all} columns to be checked, simply set the value to
@code{_all} (in other words: @option{--noblank=_all}).
-This mode is useful when there are many columns in the table and you want a
``clean'' output table (with no blank values in any column): entering their
name or number one-by-one can be buggy and frustrating.
-In this mode, no other column name should be given.
-For example, if you give @option{--noblank=_all,magnitude}, then Table will
assume that your table actually has a column named @code{_all} and
@code{magnitude}, and if it does not, it will abort with an error.
+Other than the units of the output, this operator behaves similarly to
@code{date-to-sec}.
+See the description of that operator for an example.
-If you want to change column values using @ref{Column arithmetic} (and set
some to blank, to later remove), or you want to select rows based on columns
that you have imported from other tables, you should use the
@option{--noblankend} option described below.
-Also, see @ref{Operation precedence in Table}.
+@item sorted-to-interval
+Given a single column (which must be already sorted and have a numeric data
+type), return two columns: the first returned column is the minimum and the
+second returned column is the maximum value of the interval of each row.
+The maximum of each row is equal to the minimum of the next row, thus
+creating a contiguous interval coverage of the input column's range over all
+rows.
-@item -s STR
-@itemx --sort=STR
-Sort the output rows based on the values in the @code{STR} column (can be a
column name or number).
-By default the sort is done in ascending/increasing order, to sort in a
descending order, use @option{--descending}.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+The minimum value of the first row and the maximum of the last row will be
+smaller/larger than the respective input value (by half the distance to the
+next/previous element).
+This is done to ensure that if your input has a fixed interval length between
all elements, the first and last intervals also have that fixed length.
-The chosen column does not have to be in the output columns.
-This is good when you just want to sort using one column's values, but do not
need that column anymore afterwards.
+For example, with the command below, we'll use this operator on a hypothetical
radial profile.
+Note how the intervals are contiguous even though the radius values are not
equally distant (if the row with a radius of 2.5 didn't exist, the intervals
would all be the same length).
+For another example of the usage of this operator, see the example in the
description of @option{--customtable} in @ref{MakeProfiles profile settings}.
-@item -d
-@itemx --descending
-When called with @option{--sort}, rows will be sorted in descending order.
+@example
+$ cat radial-profile.txt
+# Column 1: RADIUS [pix,f32,] Distance to center in pixels.
+# Column 2: MEAN [ADU,f32,] Mean value at that radius.
+0 100
+1 40
+2 30
+2.5 25
+3 20
-@item -H INT
-@itemx --head=INT
-Only print the given number of rows from the @emph{top} of the final table.
-Note that this option only affects the @emph{output} table.
-For example, if you use @option{--sort} or @option{--range}, the printed rows
-are the first ones @emph{after} applying the sort, or the range selection, to
-the full input.
-This option cannot be called with @option{--tail}, @option{--rowrange} or
@option{--rowrandom}.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+$ asttable radial-profile.txt --txtf32f=fixed --txtf32p=3 \
+ -c'arith RADIUS sorted-to-interval',MEAN
+-0.500 0.500 100.000
+0.500 1.500 40.000
+1.500 2.250 30.000
+2.250 2.750 25.000
+2.750 3.250 20.000
+@end example
-@cindex GNU Coreutils
-If the given value to @option{--head} is 0, the output columns will not have
any rows and if it is larger than the number of rows in the input table, all
the rows are printed (this option is effectively ignored).
-This behavior is taken from the @command{head} program in GNU Coreutils.
+Such intervals can be useful in scenarios like generating the input to
@option{--customtable} in MakeProfiles (see @ref{MakeProfiles profile
settings}) from a radial profile (see @ref{Generate radial profile}).
-@item -t INT
-@itemx --tail=INT
-Only print the given number of rows from the @emph{bottom} of the final table.
-See @option{--head} for more.
-This option cannot be called with @option{--head}, @option{--rowrange} or
@option{--rowrandom}.
+@end table
-@item --rowrange=INT,INT
-Only return the rows within the requested positional range (inclusive on both
sides).
-Therefore, @code{--rowrange=5,7} will return 3 of the input rows: rows 5, 6
-and 7.
-This option will abort if any of the given values is larger than the total
number of rows in the table.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@node Operation precedence in Table, Invoking asttable, Column arithmetic,
Table
+@subsection Operation precedence in Table
-With the @option{--head} or @option{--tail} options you can only see the top
or bottom few rows.
-However, with this option, you can limit the returned rows to a contiguous set
of rows in the middle of the table.
-Therefore this option cannot be called with @option{--head}, @option{--tail},
or @option{--rowrandom}.
+The Table program can do many operations on the rows and columns of the input
+tables, and they are not always applied in the order you call them on the
+command-line.
+In this section we describe which operation is done before/after which other
+operation.
+Knowing this precedence is important to avoid confusion when you request more
+than one operation.
+For a description of each option, please see @ref{Invoking asttable}.
+By default, column-based operations will be done first.
+To have the row-based operations done first, use the @option{--rowfirst}
+option.
-@item --rowrandom=INT
-@cindex Random row selection
-@cindex Row selection, by random
-Select @code{INT} rows from the input table at random (assuming a uniform
-distribution).
-This option is applied @emph{after} the value-based selection options (such as
-@option{--sort}, @option{--range}, and @option{--polygon}).
-On the other hand, only the row counters are randomly selected; this option
-does not change the order.
-Therefore, if @option{--rowrandom} is called together with @option{--sort},
the returned rows are still sorted.
-This option cannot be called with @option{--head}, @option{--tail}, or
@option{--rowrange}.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@cartouche
+@noindent
+@strong{Pipes for different precedence:} It may happen that your desired
series of operations cannot be done with the precedence mentioned below (in one
command).
+In this case, you can pipe the output of one call to @command{asttable} to
another @command{asttable}.
+Just don't forget to give @option{-O} (or @option{--colinfoinstdout}) to the
first instance (so the column metadata are also passed to the next instance).
+Without metadata, all numbers will be read as double-precision (see
@ref{Gnuastro text table format}; recall that piping is done in plain text
format), vector columns will be broken into single-valued columns, and column
names, units and comments will be lost.
+At the end of this section, there is an example of doing this.
+@end cartouche
-This option will only have an effect if @code{INT} is smaller than the number
-of rows when it is activated (after the value-based selection options have
-been applied).
-When there are fewer rows, a warning is printed, saying that this option has
no effect.
-The warning can be disabled with the @option{--quiet} option.
+@table @asis
+@item Input table information
+The first set of operations that will be performed (if requested) is the
+printing of the input table information.
+Therefore, when the following options are called, the column data are not read
at all.
+Table simply reads the main input's column metadata (name, units, numeric
+data type and comments) and the number of rows, and prints them.
+Table then terminates and no other operation is done.
+These options can therefore be called at the end of an arbitrarily long Table
+command when you have forgotten some information about the input table.
+You can then delete these options and continue writing the command (using the
+shell's history to retrieve the previous command with an up-arrow key).
-@cindex Reproducibility
-Due to its nature (to be random), the output of this option differs in each
run.
-Therefore 5 calls to Table with @option{--rowrandom} on the same input table
will generate 5 different outputs.
-If you want a reproducible random selection, set the @code{GSL_RNG_SEED}
environment variable and also use the @option{--envseed} option, for more see
@ref{Generating random numbers}.
+At any time, only a single one of the options in this category may be called.
+The order of checking for these options is therefore important; they are
+checked in the same order that they are described below (a minimal sketch of
+these options follows this list):
-@item --envseed
-Read the random number generator seed from the @code{GSL_RNG_SEED} environment
variable for @option{--rowrandom} (instead of generating a different seed
internally on every run).
-This is useful if you want a reproducible random selection of the input rows.
-For more, see @ref{Generating random numbers}.
+@table @asis
+@item Column and row information (@option{--information} or @option{-i})
+Print the list of input columns: the metadata of each column (its name,
+numeric data type, units and comments) is printed on a separate row of the
+output.
+Finally, print the number of rows.
-@item -E STR[,STR[,STR]]
-@itemx --noblankend=STR[,STR[,STR]]
-Remove all rows in the requested @emph{output} columns that have a blank value.
-Like above, the columns can be specified by their name or number (counting
from 1).
-This option can be called multiple times, so @option{--noblankend=MAG
--noblankend=PHOTOZ} is equivalent to @option{--noblankend=MAG,PHOTOZ}.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@item Number of columns (@option{--info-num-cols})
+Print the number of columns in the input table.
+Only a single integer (number of columns) is printed before Table terminates.
-For example, if your final output table (possibly after column arithmetic, or
adding new columns) has blank values (NaN in floating point types) in the
@code{magnitude} and @code{sn} columns, with @code{--noblankend=magnitude,sn},
the output will not contain any rows with blank values in these two columns.
+@item Number of rows (@option{--info-num-rows})
+Print the number of rows in the input table.
+Only a single integer (number of rows) is printed before Table terminates.
+@end table
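+
+For example, the commands below are a minimal sketch of these options on a
+hypothetical @file{table.fits}; each prints only its requested information
+and terminates immediately:
+
+@example
+$ asttable table.fits --information
+$ asttable table.fits --info-num-cols
+$ asttable table.fits --info-num-rows
+@end example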
-If you want blank values to be removed from the main input table @emph{before}
-any further processing (like adding columns, sorting or column arithmetic),
-you should use the @option{--noblank} option.
-With the @option{--noblank} option, the column(s) that is(are) given does not
necessarily have to be in the output (it is just temporarily used for reading
the inputs and selecting rows, but does not necessarily need to be present in
the output).
-However, the column(s) given to this option should exist in the output.
+@item Column selection (@option{--column})
+When this option is given, only the columns given to this option (from the
main input) will be used for all future steps.
+When @option{--column} (or @option{-c}) is not given, then all the main
input's columns will be used in the next steps.
-If you want @emph{all} columns to be checked, simply set the value to
@code{_all} (in other words: @option{--noblankend=_all}).
-This mode is useful when there are many columns in the table and you want a
``clean'' output table (with no blank values in any column): entering their
name or number one-by-one can be buggy and frustrating.
-In this mode, no other column name should be given.
-For example, if you give @option{--noblankend=_all,magnitude}, then Table will
assume that your table actually has a column named @code{_all} and
@code{magnitude}, and if it does not, it will abort with an error.
+@item Column-based operations
+By default the following column-based operations will be done before the
row-based operations in the next item.
+If you need to give precedence to row-based operations, use
@option{--rowfirst}.
-This option is applied just before writing the final table (after
@option{--colmetadata} has finished).
-So in case you changed the column metadata, or added new columns, you can use
the new names, or the newly defined column numbers.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@table @asis
-@item -m STR/INT,STR[,STR[,STR]]
-@itemx --colmetadata=STR/INT,STR[,STR[,STR]]
-Update the specified column metadata in the output table.
-This option is applied after all other column-related operations are complete,
for example, column arithmetic, or column concatenation.
-For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+@item Column(s) from other file(s): @option{--catcolumnfile}
+When column concatenation (addition) is requested, columns from other tables
(in other files, or other HDUs of the same FITS file) will be added after the
existing columns are read from the main input.
+In one command, you can call @option{--catcolumnfile} multiple times to allow
addition of columns from many files.
-The first value (before the first comma) given to this option is the column's
identifier.
-It can either be a counter (positive integer, counting from 1), or a name (the
column's name in the output if this option was not called).
+Therefore you can merge the columns of various tables into one table in this
+step (at the start), then start adding/limiting the rows or building vector
+columns.
+If any of the row-based operations below are requested in the same
+@command{asttable} command, they will also be applied to the rows of the
+added columns.
+However, the conditions to keep/reject rows can only be applied to the rows
+of the columns in the main input table (not the columns that are added with
+these options).
-After the to-be-updated column is identified, at least one other string should
be given, with a maximum of three strings.
-The first string after the original name will be the selected column's new name.
-The next (optional) string will be the selected column's unit and the third
(optional) will be its comments.
-If the two optional strings are not given, the original column's units or
comments will remain unchanged.
+@item Extracting single-valued columns from vectors (@option{--fromvector})
+Once all the input columns are read into memory, if any of them are vectors,
you can extract a single-valued column from the vector columns at this stage.
+For more on vector columns, see @ref{Vector columns}.
-If any of the values contains a comma, you should place a `@code{\}' before
the comma to avoid it getting confused with a delimiter.
-For example, see the command below for a column description that contains a
comma:
+@item Creating vector columns (@option{--tovector})
+Other than column arithmetic (which is applied later), no new columns can be
+added after this stage, so the @option{--tovector} operator is applied here.
+You can use it to merge multiple columns that are available in this stage to a
single vector column.
+For more, see @ref{Vector columns}.
-@example
-$ asttable table.fits \
- --colmetadata=NAME,UNIT,"Comments\, with a comma"
-@end example
+@item Column arithmetic
+Once the previous column-based operations are complete, column arithmetic is
+done (if requested).
+For more on column arithmetic, see @ref{Column arithmetic}.
-Generally, since the comma is commonly used as a delimiter in many scenarios,
to avoid complicating your future analysis with the table, it is best to avoid
using a comma in the column name and units.
+@end table
-Some examples of this option are available in the tutorials, in particular
@ref{Working with catalogs estimating colors}.
-Here are some more specific examples:
-@table @option
+@item Row-based operations
+Row-based operations only act on the rows of the columns that exist when they
+are activated.
+By default row-based operations are activated after column-based operations
(which are mentioned above).
+If you need to give precedence to row-based operations, use
@option{--rowfirst}.
-@item --colmetadata=MAGNITUDE,MAG_F160W
-This will convert the name of the original @code{MAGNITUDE} column to
@code{MAG_F160W}, leaving the unit and comments unchanged.
+@table @asis
+@item Rows from other file(s) (@option{--catrowfile})
+With this feature, you can import rows from other tables (in other files, or
other HDUs of the same FITS file).
+The same column selection of @option{--column} is applied to the tables given
to this option.
+The column metadata (name, units and comments) will be taken from the main
input.
+Two conditions are mandatory for adding rows:
+@itemize
+@item
+The number of columns used from the new tables must be equal to the number of
columns in memory, by the time control reaches here.
+@item
+The data type of each column (see @ref{Numeric data types}) should be the same
as the respective column in memory by the time control reaches here.
+If the data types are different, you can use the type conversion operators of
+column arithmetic, which has a higher precedence (and will therefore be
+applied before this by default); a minimal sketch is given after this list.
+For more on type conversion, see @ref{Numerical type conversion operators}
+and @ref{Column arithmetic}.
+@end itemize
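+
+As a minimal sketch of the second condition, assume the hypothetical
+@file{a.fits} stores @code{MAG} as a 32-bit floating point column, while
+@file{b.fits} stores it as 64-bit.
+Because column arithmetic has a higher precedence, something like the command
+below would convert the in-memory column before the rows of @file{b.fits} are
+appended:
+
+@example
+$ asttable a.fits -c'arith MAG float64' \
+           --catrowfile=b.fits --catrowhdu=1
+@end example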
-@item --colmetadata=3,MAG_F160W,mag
-This will convert the name of the third column of the final output to
@code{MAG_F160W} and the units to @code{mag}, while leaving the comments
untouched.
+@item Row selection by value in a column
+The following operations select rows based on the values in them.
+A more complete description of each of these options is given in @ref{Invoking
asttable}.
-@item --colmetadata=MAGNITUDE,MAG_F160W,mag,"Magnitude in F160W filter"
-This will convert the name of the original @code{MAGNITUDE} column to
@code{MAG_F160W}, and the units to @code{mag} and the comments to
@code{Magnitude in F160W filter}.
-Note the double quotations around the comment string; they are necessary to
-preserve the white-space characters within the column comment as it passes
-from the command-line into the program (otherwise, upon reaching a
-white-space character, the shell will consider this option to be finished,
-causing unexpected behavior).
-@end table
+@itemize
+@item
+@option{--range}: only keep rows where the value in the given column is within
a certain interval.
+@item
+@option{--inpolygon}: only keep rows where the value is within the polygon of
@option{--polygon}.
+@item
+@option{--outpolygon}: only keep rows outside the polygon of
@option{--polygon}.
+@item
+@option{--equal}: only keep rows with a specified value in the given column.
+@item
+@option{--notequal}: only keep rows without the specified value in the given
+column.
+@item
+@option{--noblank}: only keep rows that are not blank in the given column(s).
+@end itemize
-If your table is large and generated by a script, you can first do all your
operations on your table's data and write it into a temporary file (maybe
called @file{temp.fits}).
-Then, look into that file's metadata (with @command{asttable temp.fits -i}) to
see the exact column positions and possible names, then add the necessary calls
to this option to your previous call to @command{asttable}, so it writes proper
metadata in the same run (for example, in a script or Makefile).
-Recall that when a name is given, this option will update the metadata of the
first column that matches, so if you have multiple columns with the same name,
you can call this option multiple times with the same first argument to change
them all to different names.
+These options can be called any number of times (to limit the final rows based
on values in different columns for example).
+Since these are row-rejection operations, their internal order is irrelevant.
+In other words, it makes no difference if @option{--equal} is called before or
after @option{--range} for example.
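+
+For example, the command below is a minimal sketch (on a hypothetical
+catalog) combining several of these options in one call; swapping their order
+on the command-line would not change the output:
+
+@example
+$ asttable cat.fits --range=MAGNITUDE,18:22 \
+           --equal=FLAG,0 --noblank=SN
+@end example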
-Finally, if you already have a FITS table by other means (for example, by
downloading) and you merely want to update the column metadata and leave the
data intact, it is much more efficient to directly modify the respective FITS
header keywords with @code{astfits}, using the keyword manipulation features
described in @ref{Keyword inspection and manipulation}.
-@option{--colmetadata} is mainly intended for scenarios where you want to edit
the data so it will always load the full/partial dataset into memory, then
write out the resulting datasets with updated/corrected metadata.
+As a side-effect, because NaN/blank values are defined to fail on any
condition, these operations will also remove rows with NaN/blank values in the
specified column they are checking.
+Also, the columns that are used for these operations do not necessarily have
to be in the final output table (you may not need the column after doing the
selection based on it).
+By default, these options are applied after merging columns from other tables.
+However, currently, the column given to these options can only come from the
main input table.
+If you need to apply these operations on columns from
@option{--catcolumnfile}, pipe the output of one instance of Table with
@option{--catcolumnfile} into another instance of Table as suggested in the box
above this list.
-@item -f STR
-@itemx --txtf32format=STR
-The plain-text format of 32-bit floating point columns when output is not
binary (this option is ignored for binary outputs like FITS tables, see
@ref{Printing floating point numbers}).
-The acceptable values are listed below.
-This is just the format of the plain-text outputs; see
@option{--txtf32precision} for customizing their precision.
-@table @code
-@item fixed
-Fixed-point notation (for example @code{123.4567}).
-@item exp
-Exponential notation (for example @code{1.234567e+02}).
-@end table
+These row-selection operations are applied first because the speed of later
+operations can be greatly affected by the number of rows.
+For example, if you also call the @option{--sort} option, and your row
+selection results in 50 rows (from an input of 10000 rows), limiting the
+number of rows first will greatly speed up the sorting of your final output.
-The default mode is @code{exp} since it is the most generic and will not cause
any loss of data.
-Be very cautious if you set it to @code{fixed}.
-As a rule of thumb, the fixed-point notation is only good if the numbers are
larger than 1.0, but not too large!
-Given that the total number of accurate decimal digits is fixed, the more
-digits you have on the left of the decimal point (the integer part), the more
-inaccurate digits will be printed on the right of the decimal point.
+@item Sorting (@option{--sort})
+Sort the rows based on the values in a certain column.
+The column to sort by can only come from the main input table columns (not
columns that may have been added with @option{--catcolumnfile}).
-@item -p STR
-@itemx --txtf32precision=INT
-Number of digits after (to the right side of) the decimal point (precision)
for columns with a 32-bit floating point datatype (this option is ignored for
binary outputs like FITS tables, see @ref{Printing floating point numbers}).
-This can take zero or any positive integer.
-When given a value of zero, the floating point number will be rounded to the
nearest integer.
+@item Row selection (by position)
+@itemize
+@item
+@option{--head}: keep only the requested number of top rows.
+@item
+@option{--tail}: keep only the requested number of bottom rows.
+@item
+@option{--rowrandom}: keep only a requested number of randomly selected rows.
+@item
+@option{--rowrange}: keep only rows within a certain positional interval.
+@end itemize
-@cindex IEEE 754
-The default value to this option is 6.
-This is because according to IEEE 754, 32-bit floating point numbers can be
accurately presented to 7.22 decimal digits (see @ref{Printing floating point
numbers}).
-Since we only have an integer number of digits in a number, we'll round it to
7 decimal digits.
-Furthermore, the precision is only defined to the right side of the decimal
point.
-In exponential notation (default of @option{--txtf32format}), one decimal
digit will be printed on the left of the decimal point.
-So the default value to this option is @mymath{7-1=6}.
+These options limit/select rows based on their position within the table (not
their value in any certain column).
-@item -A STR
-@itemx --txtf64format=STR
-The plain-text format of 64-bit floating point columns when output is not
binary (this option is ignored for binary outputs like FITS tables, see
@ref{Printing floating point numbers}).
-The acceptable values are listed below.
-This is just the format of the plain-text outputs; see
@option{--txtf64precision} for customizing their precision.
-@table @code
-@item fixed
-Fixed-point notation (for example @code{12345.6789012345}).
-@item exp
-Exponential notation (for example @code{1.23456789012345e4}).
+@item Transpose vector columns (@option{--transpose})
+Transposing vector columns will not affect the number or metadata of columns;
+it will just re-arrange them in their 2D structure.
+As a result, after transposing, the number of rows changes, as well as the
number of elements in each vector column.
+See the description of this option in @ref{Invoking asttable} for more (with
an example).
@end table
-The default mode is @code{exp} since it is the most generic and will not cause
any loss of data.
-Be very cautious if you set it to @code{fixed}.
-As a rule of thumb, the fixed-point notation is only good if the numbers are
larger than 1.0, but not too large!
-Given that the total number of accurate decimal digits is fixed, the more
-digits you have on the left of the decimal point (the integer part), the more
-inaccurate digits will be printed on the right of the decimal point.
+@item Column metadata (@option{--colmetadata})
+Once the structure of the final table is set, you can set the column metadata
just before finishing.
-@item -B STR
-@itemx --txtf64precision=INT
-Number of digits after the decimal point (precision) for columns with a 64-bit
floating point datatype (this option is ignored for binary outputs like FITS
tables, see @ref{Printing floating point numbers}).
-This can take zero or any positive integer.
-When given a value of zero, the floating point number will be rounded to the
nearest integer.
+@item Output row selection (@option{--noblankend})
+Only keep the output rows that do not have a blank value in the given
column(s).
+For example, you may need to apply arithmetic operations on the columns
(through @ref{Column arithmetic}) before rejecting the undesired rows.
+After the arithmetic operation is done, you can use the @code{where} operator
+to set the undesired values to NaN/blank, then use the @option{--noblankend}
+option to remove those rows just before writing the output.
+In other scenarios, you may want to remove blank values based on columns in
+another table.
+To help in readability, you can also use the final column names that you set
+with @option{--colmetadata}!
+See the example below for applying a generic value-based row selection using
+@option{--noblankend}.
+@end table
-@cindex IEEE 754
-The default value to this option is 15.
-This is because according to IEEE 754, 64-bit floating point numbers can be
accurately presented to 15.95 decimal digits (see @ref{Printing floating point
numbers}).
-Since we only have an integer number of digits in a number, we'll round it to
16 decimal digits.
-Furthermore, the precision is only defined to the right side of the decimal
point.
-In exponential notation (default of @option{--txtf64format}), one decimal
digit will be printed on the left of the decimal point.
-So the default value to this option is @mymath{16-1=15}.
-
-@item -Y
-@itemx --txteasy
-When output is a plain-text file or just gets printed on standard output (the
terminal), all floating point columns are printed in fixed point notation (as
in @code{123.456}) instead of the default exponential notation (as in
@code{1.23456e+02}).
-For 32-bit floating points, this option will use a precision of 3 digits (see
@option{--txtf32precision}) and for 64-bit floating points use a precision of 6
digits (see @option{--txtf64precision}).
-This can be useful for human readability, but be careful with some scenarios
(for example @code{1.23e-120}, which will show only as @code{0.0}!).
-When this option is called, any value given to the following options is ignored:
@option{--txtf32format}, @option{--txtf32precision}, @option{--txtf64format}
and @option{--txtf64precision}.
-For example below you can see the output of table with and without this option:
+As an example, let's review how Table interprets the command below.
+We are assuming that @file{table.fits} contains at least three columns:
+@code{RA}, @code{DEC} and @code{PARAM}, and that you only want the RA and Dec
+of the rows where @mymath{p\times 2<5} (@mymath{p} being the value of each
+row in the @code{PARAM} column).
@example
-$ asttable table.fits --head=5 -O
-# Column 1: OBJNAME [name ,str23, ] Name in HyperLeda.
-# Column 2: RAJ2000 [deg ,f64 , ] Right Ascension.
-# Column 3: DEJ2000 [deg ,f64 , ] Declination.
-# Column 4: RADIUS [arcmin,f32 , ] Major axis radius.
-NGC0884 2.3736267000000e+00 5.7138753300000e+01 8.994357e+00
-NGC1629 4.4935191000000e+00 -7.1838322400000e+01 5.000000e-01
-NGC1673 4.7109672000000e+00 -6.9820892700000e+01 3.499210e-01
-NGC1842 5.1216920000000e+00 -6.7273195300000e+01 3.999171e-01
-
-$ asttable table.fits --head=5 -O -Y
-# Column 1: OBJNAME [name ,str23, ] Name in HyperLeda.
-# Column 2: RAJ2000 [deg ,f64 , ] Right Ascension.
-# Column 3: DEJ2000 [deg ,f64 , ] Declination.
-# Column 4: RADIUS [arcmin,f32 , ] Major axis radius.
-NGC0884 2.373627 57.138753 8.994
-NGC1629 4.493519 -71.838322 0.500
-NGC1673 4.710967 -69.820893 0.350
-NGC1842 5.121692 -67.273195 0.400
+$ asttable table.fits -cRA,DEC --noblankend=MULTIP \
+ -c'arith PARAM 2 x set-i i i 5 gt nan where' \
+ --colmetadata=3,MULTIP,unit,"Description of column"
@end example
-This is also useful when you want to make the outputs of other programs
-easier to read, for example:
+@noindent
+Due to the precedence described in this section, Table does these operations
+(independent of the order in which they are written on the command-line):
+
+@enumerate
+@item
+At the start (with @code{-cRA,DEC}), Table reads the @code{RA} and @code{DEC}
columns.
+@item
+Of all the operations in the command above, column arithmetic (with
+@option{-c'arith ...'}) has the highest precedence.
+So the arithmetic operation is done and stored as a new (third) column.
+In this arithmetic operation, we multiply all the values of the @code{PARAM}
column by 2, then set all those with a value larger than 5 to NaN (for more on
understanding this operation, see the `@code{set-}' and `@code{where}'
operators in @ref{Arithmetic operators}).
+@item
+Updating column metadata (with @option{--colmetadata}) is then done to give a
name (@code{MULTIP}) to the newly calculated (third) column.
+During the process, besides a name, we also set a unit and description for the
new column.
+These metadata entries are @emph{very important}, so always be sure to add
metadata after doing column arithmetic.
+@item
+The lowest precedence operation is @option{--noblankend=MULTIP}.
+So only rows that are not blank/NaN in the @code{MULTIP} column are kept.
+@item
+Finally, the output table (with three columns) is written to the command-line.
+If you also want to print the column metadata, you can use the @option{-O} (or
@option{--colinfoinstdout}) option.
+Alternatively, if you want the output in a file, you can use the
@option{--output} option to save the table in FITS or plain-text format.
+@end enumerate
+
+It may happen that your desired operation needs a different precedence.
+In this case you can pipe the output of Table into another call of Table and
use the @option{-O} (or @option{--colinfoinstdout}) option to preserve the
metadata between the two calls.
+
+For example, let's assume that you want to sort the output table from the
example command above based on the new @code{MULTIP} column.
+Since sorting is done prior to column arithmetic, you cannot do it in one
command, but you can circumvent this limitation by simply piping the output
(including metadata) to another call to Table:
@example
-$ echo 123.45678 | asttable
-1.234567800000000e+02
+asttable table.fits -cRA,DEC --noblankend=MULTIP --colinfoinstdout \
+ -c'arith PARAM 2 x set-i i i 5 gt nan where' \
+ --colmetadata=3,MULTIP,unit,"Description of column" \
+ | asttable --sort=MULTIP --output=selected.fits
+@end example
-$ echo 123.45678 | asttable -Y
-123.456780
+@node Invoking asttable, , Operation precedence in Table, Table
+@subsection Invoking Table
+
+Table will read/write, select, modify, or show the information of the rows and
columns in recognized Table formats (including FITS binary, FITS ASCII, and
plain text table files, see @ref{Tables}).
+Output columns can also be determined by number or regular expression matching
of column names, units, or comments.
+The executable name is @file{asttable} with the following general template
+
+@example
+$ asttable [OPTION...] InputFile
@end example
-@cartouche
@noindent
-@strong{Can result in loss of information}: be very careful with this option!
-It can loose precision or generally the full value if the value is not within
a "good" range like this example.
-Such cases are the reason that this is not the default format of plain-text
outputs.
+One line examples:
@example
-$ echo 123.4e-9 | asttable -Y
-0.000000
-@end example
-@end cartouche
-@end table
+## Get the table column information (name, data type, or units):
+$ asttable table.fits --information
+
+## Print columns named RA and DEC, followed by all the columns where
+## the name starts with "MAG_":
+$ asttable table.fits --column=RA --column=DEC --column=/^MAG_/
+
+## Similar to the above, but with one call to `--column' (or `-c'),
+## also sort the rows by the input's photometric redshift (`Z_PHOT')
+## column. To confirm the sort, you can add `Z_PHOT' to the columns
+## to print.
+$ asttable table.fits -cRA,DEC,/^MAG_/ --sort=Z_PHOT
+
+## Similar to the above, but only print rows that have a photometric
+## redshift between 2 and 3.
+$ asttable table.fits -cRA,DEC,/^MAG_/ --range=Z_PHOT,2:3
+
+## Only print rows with a value in the 10th column above 100000:
+$ asttable table.txt --range=10,10e5,inf
+
+## Only print the 2nd column, and the third column multiplied by 5;
+## save the resulting two columns in `table.fits'.
+$ asttable table.fits -c2,'arith $2 5 x' -otable.fits
+
+## Sort the output rows by the third column, save output:
+$ asttable table.fits --sort=3 -ooutput.txt
+
+## Subtract the first column from the second in `cat.txt' (can also
+## be a FITS table) and keep the third and fourth columns.
+$ asttable cat.txt -c'arith $2 $1 -',3,4 -ocat.fits
+
+## Convert sexagesimal coordinates to degrees (same can be done in a
+## large table given as argument).
+$ echo "7h34m35.5498 31d53m14.352s" | asttable
+
+## Convert RA and Dec in degrees to sexagesimal (same can be done in
+## a large table given as argument).
+$ echo "113.64812416667 31.88732" \
+    | asttable -c'arith $1 degree-to-ra $2 degree-to-dec'
+@end example
+Table's input dataset can be given either as a file or from Standard input
(piped from another program, see @ref{Standard input}).
+In the absence of selected columns, all the input's columns and rows will be
written to the output.
+The full set of operations Table can do is described in detail below, but for
a more high-level introduction to the various operations, and their precedence,
see @ref{Operation precedence in Table}.
+If any output file is explicitly requested (with @option{--output}) the output
table will be written in it.
+When no output file is explicitly requested the output table will be written
to the standard output.
+If the specified output is a FITS file, the type of FITS table (binary or
ASCII) will be determined from the @option{--tabletype} option.
+If the output is not a FITS file, it will be printed as a plain text table
(with space characters between the columns).
+When the output is not binary (for example, standard output or a plain-text
file),
the @option{--txtf32*} or @option{--txtf64*} options can be used for the
formatting of floating point columns (see @ref{Printing floating point
numbers}).
+When the columns are accompanied by meta-data (like column name, units, or
comments), this information will also be printed in the plain text file before
the table, as described in @ref{Gnuastro text table format}.
+For the full list of options common to all Gnuastro programs please see
@ref{Common options}.
+Options can also be stored in directory, user or system-wide configuration
files to avoid repeating them on the command-line, see @ref{Configuration
files}.
+Table does not follow the automatic output convention that is common in most
Gnuastro programs, see @ref{Automatic output}.
+Thus, in the absence of an output file, the selected columns will be printed
on the command-line with no column information, ready for redirecting to other
tools like @command{awk}.
+@cartouche
+@noindent
+@strong{Sexagesimal coordinates as floats in plain-text tables:}
+When a column is determined to be a floating point type (32-bit or 64-bit) in
a plain-text table, it can contain sexagesimal values in the format of
`@code{_h_m_s}' (for RA) and `@code{_d_m_s}' (for Dec), where the `@code{_}'s
are place-holders for numbers.
+In this case, the string will be immediately converted to a single floating
point number (in units of degrees) and stored in memory with the rest of the
column or table.
+Besides being useful in large tables, this feature makes the conversion of
sexagesimal coordinates to degrees very easy, for example:
+@example
+echo "7h34m35.5498 31d53m14.352s" | asttable
+@end example
+@noindent
+The inverse can also be done with the more general column arithmetic
+operators:
+@example
+echo "113.64812416667 31.88732" \
+ | asttable -c'arith $1 degree-to-ra $2 degree-to-dec'
+@end example
+@noindent
+If you want to preserve the sexagesimal contents of a column, you should store
that column as a string, see @ref{Gnuastro text table format}.
+@end cartouche
+@table @option
+@item -i
+@itemx --information
+Only print the column information in the specified table on the command-line
and exit.
+Each column's information (number, name, units, data type, and comments) will
be printed as a row on the command-line.
+If the column is a multi-valued (vector) column, a @code{[N]} is printed
after the type, where @code{N} is the number of elements within that vector.
+Note that the FITS standard only requires the data type (see @ref{Numeric data
types}), and in plain text tables, no meta-data/information is mandatory.
+Gnuastro has its own convention in the comments of a plain text table to store
and transfer this information as described in @ref{Gnuastro text table format}.
+This option will take precedence over all other operations in Table, so when
it is called along with other operations, they will be ignored, see
@ref{Operation precedence in Table}.
+This can be useful if you forget the identifier of a column after you have
already typed some on the command-line.
+You can simply add a @option{-i} to your already-written command (without
changing anything) and run Table, to see the whole list of column names and
information.
+Then you can use the shell history (with the up arrow key on the keyboard),
and retrieve the last command with all the previously typed columns present,
delete @option{-i} and add the identifier you had forgotten.
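+
+For example, in the sketch below (with a hypothetical @file{table.fits}),
+the column information is printed and the already-typed column selection is
+ignored:
+
+@example
+$ asttable table.fits -cRA,DEC -i
+@end example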
+@item --info-num-cols
+Similar to @option{--information}, but only the number of the input table's
columns will be printed as a single integer (useful in scripts for example).
-@node Query, , Table, Data containers
-@section Query
+@item --info-num-rows
+Similar to @option{--information}, but only the number of the input table's
rows will be printed as a single integer (useful in scripts for example).
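+
+For example, a minimal sketch (with a hypothetical @file{table.fits}) that
+stores the number of rows in a shell variable:
+
+@example
+$ nrows=$(asttable table.fits --info-num-rows)
+$ echo "table.fits contains $nrows rows"
+@end example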
-@cindex IVOA
-@cindex Query
-@cindex TAP (Table Access Protocol)
-@cindex ADQL (Astronomical Data Query Language)
-@cindex Astronomical Data Query Language (ADQL)
-There are many astronomical databases available for downloading astronomical
data.
-Most follow the International Virtual Observatory Alliance (IVOA,
@url{https://ivoa.net}) standards (and in particular the Table Access Protocol,
or TAP@footnote{@url{https://ivoa.net/documents/TAP}}).
-With TAP, it is possible to submit your queries via a command-line downloader
(for example, @command{curl}) to only get specific tables, targets (rows in a
table) or measurements (columns in a table): you do not have to download the
full table (which can be very large in some cases)!
-These customizations are done through the Astronomical Data Query Language
(ADQL@footnote{@url{https://ivoa.net/documents/ADQL}}).
+@cindex AWK
+@cindex GNU AWK
+@item -c STR/INT
+@itemx --column=STR/INT
+Set the output columns, either by specifying the column number or name.
+For more on selecting columns, see @ref{Selecting table columns}.
+If a value of this option starts with `@code{arith }', column arithmetic will
be activated, allowing you to edit/manipulate column contents.
+For more on column arithmetic see @ref{Column arithmetic}.
-Therefore, if you are sufficiently familiar with TAP and ADQL, you can easily
custom-download any part of an online dataset.
-However, you also need to keep a record of the URLs of each database and in
many cases, the commands will become long and hard/buggy to type on the
command-line.
-On the other hand, most astronomers do not know TAP or ADQL at all, and are
forced to go to the database's web page which is slow (it needs to download so
many images, and has too much annoying information), requires manual
interaction (further making it slow and buggy), and cannot be automated.
+To ask for multiple columns, this option can be used in two ways: 1) multiple
calls to this option; 2) a comma between each column specifier in one call to
this option.
+These different solutions may be mixed in one call to Table: for example,
`@option{-cRA,DEC,MAG}', or `@option{-cRA,DEC -cMAG}' are both equivalent to
`@option{-cRA -cDEC -cMAG}'.
+The order of the output columns will be the same order given to the option or
in the configuration files (see @ref{Configuration file precedence}).
-Gnuastro's Query program is designed to be the middle-man in this process: it
provides a simple high-level interface to let you specify your constraints on
what you want to download.
-It then internally constructs the command to download the data based on your
inputs and runs it to download your desired data.
-Query also prints the full command before it executes it (if not called with
@option{--quiet}).
-Also, if you ask for a FITS output table, the full command is written into its
0-th extension along with other input parameters to query (all Gnuastro
programs generally keep their input configuration parameters as FITS keywords
in the zero-th output).
-You can see it with Gnuastro's Fits program, like below:
+This option is not mandatory; if no specific columns are requested, all the
input table's columns will be in the output.
+When this option is called multiple times, it is possible to output one column
more than once.
-@example
-$ astfits query-output.fits -h0
-@end example
+@item -w FITS
+@itemx --wcsfile=FITS
+FITS file that contains the WCS to be used in the @code{wcs-to-img} and
@code{img-to-wcs} operators of @ref{Column arithmetic}.
+The extension name/number within the FITS file can be specified with
@option{--wcshdu}.
-With the full command used to download the dataset, you only need a minimal
knowledge of ADQL to do lower-level customizations on your downloaded dataset.
-You can simply copy that command and change the parts of the query string you
want: ADQL is very powerful!
-For example, you can ask the server to do mathematical operations on the
columns and apply selections after those operations, or combine/match multiple
datasets.
-We will try to add high-level interfaces for such capabilities, but generally,
do not limit yourself to the high-level operations (that cannot cover
everything!).
+If the value to this option is `@option{none}', no WCS will be written in the
output.
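+
+For example, the sketch below (with hypothetical file names and HDU number)
+converts the @code{RA} and @code{DEC} columns to pixel coordinates using the
+WCS of @file{image.fits}:
+
+@example
+$ asttable table.fits -c'arith RA DEC wcs-to-img' \
+           --wcsfile=image.fits --wcshdu=1
+@end example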
-@menu
-* Available databases:: List of available databases to Query.
-* Invoking astquery:: Inputs, outputs and configuration of Query.
-@end menu
+@item -W STR
+@itemx --wcshdu=STR
+FITS extension/HDU in the FITS file given to @option{--wcsfile} (see the
description of @option{--wcsfile} for more).
-@node Available databases, Invoking astquery, Query, Query
-@subsection Available databases
+@item -L FITS/TXT
+@itemx --catcolumnfile=FITS/TXT
+Concatenate (or add, or append) the columns of this option's value (a
filename) to the output columns.
+This option may be called multiple times (to add columns from more than one
file into the final output); the columns from each file will be added in the
same order that this option is called.
+The number of rows in the file(s) given to this option has to be the same as
the input table (before any type of row-selection), see @ref{Operation
precedence in Table}.
-The current list of databases supported by Query are listed at the end of this
section.
-To get the list of available datasets within each database, you can use the
@option{--information} option.
-for example, with the command below you can get a list of the roughly 100
datasets that are available within the ESA Gaia server with their description:
+By default, all the columns of the given file will be appended; if you only
want certain columns to be appended, use the @option{--catcolumns} option to
specify their name or number (see @ref{Selecting table columns}).
+Note that the columns given to @option{--catcolumns} must be present in all
the given files (if this option is called more than once with more than one
file).
-@example
-$ astquery gaia --information
-@end example
+If the file given to this option is a FITS file, it is necessary to also
define the corresponding HDU/extension with @option{--catcolumnhdu}.
+Also note that no operation (such as row selection and arithmetic) is applied
to the table given to this option.
-@noindent
-However, other databases like VizieR host many more datasets (tens of
thousands!).
-Therefore it is very inconvenient to get the @emph{full} information every
time you want to find your dataset of interest (the full metadata file VizieR
is more than 20Mb).
-In such cases, you can limit the downloaded and displayed information with the
@code{--limitinfo} option.
-For example, with the first command below, you can get all datasets relating
to the MUSE (an instrument on the Very Large Telescope), and those that include
Roland Bacon (Principle Investigator of MUSE) as an author (@code{Bacon, R.}).
-Recall that @option{-i} is the short format of @option{--information}.
+If the appended columns have a name, and their name is already present in the
table before adding those columns, the column names of each file will be
appended with a @code{-N}, where @code{N} is a counter starting from 1 for each
appended table.
+Just note that in the FITS standard (and thus in Gnuastro), column names are
not case-sensitive.
-@example
-$ astquery vizier -i --limitinfo=MUSE
-$ astquery vizier -i --limitinfo="Bacon R."
-@end example
+This is done because when concatenating columns from multiple tables (more
than two) into one, they may have the same name, and it is not good practice to
have multiple columns with the same name.
+You can disable this feature with @option{--catcolumnrawname}.
+Generally, you can use the @option{--colmetadata} option to update column
metadata in the same command, after all the columns have been concatenated.
-Once you find the recognized name of your desired dataset, you can see the
column information of that dataset with adding the dataset name.
-For example, with the command below you can see the column metadata in the
@code{J/A+A/608/A2/udf10} dataset (one of the datasets in the search above)
using this command:
+For example, let's assume you have two catalogs of the same objects (same
number of rows) in different filters, such that @file{f160w-cat.fits} has a
@code{MAGNITUDE} column with the magnitude of each object in the @code{F160W}
filter, and @file{f105w-cat.fits} similarly has a @code{MAGNITUDE} column for
the @code{F105W} filter.
+You can use column concatenation like below to import the @code{MAGNITUDE}
column from the @code{F105W} catalog into the @code{F160W} catalog, while
giving each magnitude column a different name:
@example
-$ astquery vizier --dataset=J/A+A/608/A2/udf10 -i
+asttable f160w-cat.fits --output=both.fits \
+ --catcolumnfile=f105w-cat.fits --catcolumns=MAGNITUDE \
+ --colmetadata=MAGNITUDE,MAG-F160W,log,"Magnitude in F160W" \
+ --colmetadata=MAGNITUDE-1,MAG-F105W,log,"Magnitude in F105W"
@end example
-@cindex SDSS DR12
-For very popular datasets of a database, Query provides an easier-to-remember
short name that you can feed to @option{--dataset}.
-This short name will map to the officially recognized name of the dataset on
the server.
-In this mode, Query will also set positional columns accordingly.
-For example, most VizieR datasets have an @code{RAJ2000} column (the RA and
the epoch of 2000) so it is the default RA column name for coordinate search
(using @option{--center} or @option{--overlapwith}).
-However, some datasets do not have this column (for example, SDSS DR12).
-So when you use the short name and Query knows about this dataset, it will
internally set the coordinate columns that SDSS DR12 has: @code{RA_ICRS} and
@code{DEC_ICRS}.
-Recall that you can always change the coordinate columns with @option{--ccol}.
-
-For example, in the VizieR and Gaia databases, the recognized name for data
release 3 data is respectively @code{I/355/gaiadr3} and
@code{gaiadr3.gaia_source}.
-These technical names are hard to remember.
-Therefore Query provides @code{gaiadr3} (for VizieR) and @code{dr3} (for ESA's
Gaia database) shortcuts which you can give to @option{--dataset} instead.
-They will be internally mapped to the fully recognized name by Query.
-In the list below that describes the available databases, the available short
names, that are recognized for each, are also listed.
+@noindent
+For a more complete example, see @ref{Working with catalogs estimating colors}.
@cartouche
@noindent
-@strong{Not all datasets support TAP:} Large databases like VizieR have TAP
access for all their datasets.
-However, smaller databases have not implemented TAP for all their tables.
-Therefore some datasets that are searchable in their web interface may not be
available for a TAP search.
-To see the full list of TAP-ed datasets in a database, use the
@option{--information} (or @option{-i}) option with the dataset name like the
command below.
+@strong{Loading external columns with Arithmetic:} an alternative way to load
external columns into your output is to use column arithmetic (@ref{Column
arithmetic}), in particular the @option{load-col-} operator described in
@ref{Loading external columns}.
+However, that operator will load only one column per file/HDU every time it is
called.
+So if you have many columns to insert, it is much faster to use
@option{--catcolumnfile}: it will load all the columns in one opening of the
file, and possibly even read them all into memory in parallel!
+@end cartouche
+
+@item -u STR/INT
+@itemx --catcolumnhdu=STR/INT
+The HDU/extension of the FITS file(s) that should be concatenated, or
appended, by column with @option{--catcolumnfile}.
+If @option{--catcolumnfile} is called more than once with more than one FITS
file, it is necessary to call this option more than once.
+The HDUs will be loaded in the same order as the FITS files given to
@option{--catcolumnfile}.
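+
+For example, in the sketch below (with hypothetical file names and HDUs), the
+HDUs are matched to the files in the same order:
+
+@example
+$ asttable main.fits --catcolumnfile=a.fits --catcolumnhdu=1 \
+          --catcolumnfile=b.fits --catcolumnhdu=2
+@end example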
+
+@item -C STR/INT
+@itemx --catcolumns=STR/INT
+The column(s) in the file(s) given to @option{--catcolumnfile} to append.
+When this option is not given, all the columns will be concatenated.
+See @option{--catcolumnfile} for more.
+
+@item --catcolumnrawname
+Do not modify the names of the concatenated (appended) columns; see the
description of @option{--catcolumnfile}.
+
+@item --transpose
+Transpose (as in a matrix) the given vector column(s) individually.
+When this operation is done (see @ref{Operation precedence in Table}), only
vector columns of the same data type and with the same number of elements
should exist in the table.
+A usage of this operator is presented in the IFU spectroscopy tutorial in
@ref{Extracting a single spectrum and plotting it}.
+
+As a generic example, see the commands below.
+The @file{in.txt} table below has two vector columns (each with three
elements) in two rows.
+After running @command{asttable} with @option{--transpose}, you can see how
the vector columns have two elements per row (@code{u8(3)} has been replaced by
@code{u8(2)}), and that the table now has three rows.
@example
-$ astquery astron -i
+$ cat in.txt
+# Column 1: abc [nounits,u8(3),] First vector column.
+# Column 2: def [nounits,u8(3),] Second vector column.
+111 112 113 211 212 213
+121 122 123 221 222 223
+
+$ asttable in.txt --transpose -O
+# Column 1: abc [nounits,u8(2),] First vector column.
+# Column 2: def [nounits,u8(2),] Second vector column.
+111 121 211 221
+112 122 212 222
+113 123 213 223
@end example
-@noindent
-If your desired dataset is not in this list, but has web-access, contact the
database maintainers and ask them to add TAP access for it.
-After they do it, you should see the name added to the output list of the
command above.
-@end cartouche
+@item --fromvector=STR,INT[,INT[,INT]]
+Extract the given tokens/elements from the given vector column into separate
single-valued columns.
+The input vector column can be identified by its name or counter, see
@ref{Selecting table columns}.
+After the columns are extracted, the input vector is deleted by default.
+To preserve the input vector column, you can use @option{--keepvectfin}
described below.
+For a complete usage scenario see @ref{Vector columns}.
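+
+As a sketch (using the @file{in.txt} table shown under @option{--transpose}
+above), the command below extracts the first and third elements of the
+@code{abc} vector column into single-valued columns:
+
+@example
+$ asttable in.txt --fromvector=abc,1,3 -O
+@end example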
-The list of databases recognized by Query (and their names in Query) is
described below.
-Since Query is a new member of the Gnuastro family (first available in
Gnuastro 0.14), this list will hopefully grow significantly in the next
releases.
-If you have any particular datasets in mind, please let us know by sending an
email to @code{bug-gnuastro@@gnu.org}.
-If the dataset supports IVOA's TAP (Table Access Protocol), it should be very
easy to add.
+@item --tovector=STR/INT,STR/INT[,STR/INT]
+Move the given columns into a newly created vector column.
+The given columns can be identified by their name or counter, see
@ref{Selecting table columns}.
+After the columns are copied, they are deleted by default.
+To preserve the inputs, you can use @option{--keepvectfin} described below.
+For a complete usage scenario see @ref{Vector columns}.
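+
+As a sketch (with hypothetical column names), the command below merges three
+single-valued magnitude columns into one vector column:
+
+@example
+$ asttable cat.fits --tovector=MAG1,MAG2,MAG3 -omag-vec.fits
+@end example
+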
+@item -k
+@itemx --keepvectfin
+Do not delete the input column(s) when using @option{--fromvector} or
@option{--tovector}.
-@table @code
+@item -R FITS/TXT
+@itemx --catrowfile=FITS/TXT
+Add the rows of the given file to the output table.
+The selected columns in the tables given to this option should match (in
number and data type) the columns of the table at the point that control
reaches this phase (after column selection and column concatenation); for
more, see @ref{Operation precedence in Table}.
-@item astron
-@cindex ASTRON
-@cindex Radio astronomy
-The ASTRON Virtual Observatory service (@url{https://vo.astron.nl}) is a
database focused on radio astronomy data and images, primarily those collected
by ASTRON itself.
-A query to @code{astron} is submitted to
@code{https://vo.astron.nl/__system__/tap/run/tap/sync}.
+For example, if @file{a.fits}, @file{b.fits} and @file{c.fits} have the
columns @code{RA}, @code{DEC} and @code{MAGNITUDE} (possibly in different
column-numbers in their respective table, along with many more columns), the
command below will add their rows into the final output that will only have
these three columns:
-Here is the list of short names for dataset(s) in ASTRON's VO service:
-@itemize
-@item
-@code{tgssadr --> tgssadr.main}
-@end itemize
+@example
+$ asttable a.fits --catrowfile=b.fits --catrowhdu=1 \
+ --catrowfile=c.fits --catrowhdu=1 \
+ -cRA,DEC,MAGNITUDE --output=allrows.fits
+@end example
-@item gaia
-@cindex Gaia catalog
-@cindex Catalog, Gaia
-@cindex Database, Gaia
-The Gaia project (@url{https://www.cosmos.esa.int/web/gaia}) database which is
a large collection of star positions on the celestial sphere, as well as
peculiar velocities, parallaxes and magnitudes in some bands among many others.
-Besides scientific studies (like studying resolved stellar populations in the
Galaxy and its halo), Gaia is also invaluable for raw data calibrations, like
astrometry.
-A query to @code{gaia} is submitted to
@code{https://gea.esac.esa.int/tap-server/tap/sync}.
+@cartouche
+@cindex Provenance
+@noindent
+@strong{Provenance of each row:} When merging rows from separate catalogs, it
is important to keep track of the source catalog of each row (its provenance).
+To do this, you can use @option{--catrowfile} in combination with the
@code{constant} operator and @ref{Column arithmetic}.
+For a working example of this scenario, see the example within the
documentation of the @code{constant} operator in @ref{Building new dataset and
stack management}.
+@end cartouche
-Here is the list of short names for popular datasets within Gaia:
-@itemize
-@item
-@code{dr3 --> gaiadr3.gaia_source}
-@item
-@code{edr3 --> gaiaedr3.gaia_source}
-@item
-@code{dr2 --> gaiadr2.gaia_source}
-@item
-@code{dr1 --> gaiadr1.gaia_source}
-@item
-@code{tycho2 --> public.tycho2}
-@item
-@code{hipparcos --> public.hipparcos}
-@end itemize
+@cartouche
+@noindent
+@strong{How to avoid repetition when adding rows:} this option simply adds
the rows of multiple tables into one; it does not check their contents!
+Therefore if you use this option on multiple catalogs that may have some
shared physical objects in some of their rows, those rows/objects will be
repeated in the final table.
+In such scenarios, to avoid potential repetition, it is better to use
@ref{Match} (with @option{--notmatched} and @option{--outcols=AAA,BBB}) instead
of Table.
+For more on using Match for this scenario, see the description of
@option{--outcols} in @ref{Invoking astmatch}.
+@end cartouche
-@item ned
-@cindex NASA/IPAC Extragalactic Database (NED)
-@cindex NED (NASA/IPAC Extragalactic Database)
-The NASA/IPAC Extragalactic Database (NED, @url{http://ned.ipac.caltech.edu})
is a fusion database, integrating the information about extra-galactic sources
from many large sky surveys into a single catalog.
-It covers the full spectrum, from Gamma rays to radio frequencies and is
updated when new data arrives.
-A TAP query to @code{ned} is submitted to
@code{https://ned.ipac.caltech.edu/tap/sync}.
+@item -X STR
+@itemx --catrowhdu=STR
+The HDU/extension of the FITS file(s) that should be concatenated, or
appended, by rows with @option{--catrowfile}.
+If @option{--catrowfile} is called more than once with more than one FITS
file, it is necessary to call this option more than once also (once for every
FITS table given to @option{--catrowfile}).
+The HDUs will be loaded in the same order as the FITS files given to
@option{--catrowfile}.
-@itemize
-@item
-@code{objdir --> NEDTAP.objdir}: default TAP-based dataset in NED.
+@item -O
+@itemx --colinfoinstdout
+@cindex Standard output
+Add column metadata when the output is printed in the standard output.
+Usually the standard output is used for a fast visual check, or to pipe into
other metadata-agnostic programs (like AWK) for further processing.
+So by default meta-data are not included.
+But when piping to other Gnuastro programs (where metadata can be interpreted
and used) it is recommended to use this option and use column names in the next
program.
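+
+For example, in the sketch below (with hypothetical names), @option{-O} lets
+the second call to Table select the sorting column by name:
+
+@example
+$ asttable cat.fits -cRA,DEC -O | asttable --sort=RA -osorted.fits
+@end example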
-@item
-@cindex VOTable
-@code{extinction}: A command-line interface to the
@url{https://ned.ipac.caltech.edu/extinction_calculator, NED Extinction
Calculator}.
-It only takes a central coordinate and returns a VOTable of the calculated
extinction in many commonly used filters at that point.
-As a result, options like @option{--width} or @option{--radius} are not
supported.
-However, Gnuastro does not yet support the VOTable format.
-Therefore, if you specify an @option{--output} file, it should have an
@file{.xml} suffix and the downloaded file will not be checked.
+@item -r STR,FLT:FLT
+@itemx --range=STR,FLT:FLT
+Only output rows that have a value within the given range in the @code{STR}
column (can be a name or counter).
+Note that the range is only inclusive of the lower limit.
+For example, with @code{--range=sn,5:20} the output's columns will only
contain rows that have a value in the @code{sn} column (not case-sensitive)
that is greater or equal to 5, and less than 20.
+You can also use a comma to separate the two values, as in
@code{--range=sn,5,20}.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-Until VOTable support is added to Gnuastro, you can use GREP, AWK and SED to
convert the VOTable data into a FITS table with a command like below (assuming
the queried VOTable is called @file{ned-extinction.xml}):
+This option can be called multiple times (different ranges for different
columns) in one run of the Table program.
+This is very useful for selecting the final rows from multiple
criteria/columns.
-@verbatim
-grep '^<TR><TD>' ned-extinction.xml \
- | sed -e's|<TR><TD>||' \
- -e's|</TD></TR>||' \
- -e's|</TD><TD>|@|g' \
- | awk 'BEGIN{FS="@"; \
- print "# Column 1: FILTER [name,str15] Filter name"; \
- print "# Column 2: CENTRAL [um,f32] Central Wavelength"; \
- print "# Column 3: EXTINCTION [mag,f32] Galactic Ext."; \
- print "# Column 4: ADS_REF [ref,str50] ADS reference"} \
- {printf "%-15s %g %g %s\n", $1, $2, $3, $4}' \
- | asttable -oned-extinction.fits
-@end verbatim
+The chosen column does not have to be in the output columns.
+This is good when you just want to select using one column's values, but do
not need that column anymore afterwards.
-Once the table is in FITS, you can easily get the extinction for a certain
filter (for example, the @code{SDSS r} filter) like the command below:
+For one example of using this option, see the example under
+@option{--sigclip-median} in @ref{Invoking aststatistics}.
+
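+As a sketch (with hypothetical column names), the command below keeps only
+rows with a magnitude between 18 and 22 @emph{and} a signal-to-noise ratio of
+at least 5:
+
+@example
+$ asttable cat.fits -cRA,DEC --range=MAGNITUDE,18:22 \
+           --range=SN,5:inf
+@end example
+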
+@item --inpolygon=STR1,STR2
+Only return rows where the given coordinates are inside the polygon specified
by the @option{--polygon} option.
+The coordinate columns are the given @code{STR1} and @code{STR2} columns, they
can be a column name or counter (see @ref{Selecting table columns}).
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+
+Note that the chosen columns do not have to be in the output columns (which
are specified by the @option{--column} option).
+For example, if we want to select rows in the polygon specified in
@ref{Dataset inspection and cropping}, this option can be used like this (you
can remove the double quotations and write it all in one line if you remove
the white-spaces around the colons separating the polygon vertices):
@example
-asttable ned-extinction.fits --equal=FILTER,"SDSS r" \
- -cEXTINCTION
+asttable table.fits --inpolygon=RA,DEC \
+ --polygon="53.187414,-27.779152 \
+ : 53.159507,-27.759633 \
+ : 53.134517,-27.787144 \
+ : 53.161906,-27.807208"
@end example
-@end itemize
-@item vizier
-@cindex VizieR
-@cindex CDS, VizieR
-@cindex Catalog, Vizier
-@cindex Database, VizieR
-Vizier (@url{https://vizier.u-strasbg.fr}) is arguably the largest catalog
database in astronomy: containing more than 20500 catalogs as of mid January
2021.
-Almost all published catalogs in major projects, and even the tables in many
papers are archived and accessible here.
-For example, VizieR also has a full copy of the Gaia database mentioned below,
with some additional standardized columns (like RA and Dec in J2000).
+@cartouche
+@noindent
+@strong{Flat/Euclidean space:} The @option{--inpolygon} option assumes a
flat/Euclidean space, so it is only correct for RA and Dec when the polygon is
very small, like the example above.
+If your polygon is a degree or larger, it may not return correct results.
+Please get in touch if you need such a feature (see @ref{Suggest new feature}).
+@end cartouche
-The current implementation of @option{--limitinfo} only looks into the
description of the datasets, but since VizieR is so large, there is still a lot
of room for improvement.
-Until then, if @option{--limitinfo} is not sufficient, you can use VizieR's
own web-based search for your desired dataset:
@url{http://cdsarc.u-strasbg.fr/viz-bin/cat}
-
-
-Because VizieR curates such a diverse set of data from tens of thousands of
projects and aims for interoperability between them, the column names in VizieR
may not be identical to the column names in the surveys' own databases (Gaia in
the example above).
-A query to @code{vizier} is submitted to
@code{http://tapvizier.u-strasbg.fr/TAPVizieR/tap/sync}.
-
-@cindex 2MASS All-Sky Catalog
-@cindex AKARI/FIS All-Sky Survey
-@cindex AllWISE Data Release
-@cindex AAVSO Photometric All Sky Survey, DR9
-@cindex CatWISE 2020 catalog
-@cindex Dark Energy Survey data release 1
-@cindex GAIA Data Release (2 or 3)
-@cindex All-sky Survey of GALEX DR5
-@cindex Naval Observatory Merged Astrometric Dataset
-@cindex Pan-STARRS Data Release 1
-@cindex SDSS Photometric Catalogue, Release 12
-@cindex Whole-Sky USNO-B1.0 Catalog
-@cindex U.S. Naval Observatory CCD Astrograph Catalog
-@cindex Band-merged unWISE Catalog
-@cindex WISE All-Sky data Release
-Here is the list of short names for popular datasets within VizieR (sorted
alphabetically by their short name).
-Please feel free to suggest other major catalogs (covering a wide area or
commonly used in your field)..
-For details on each dataset with necessary citations, and links to web pages,
look into their details with their ViziR names in
@url{https://vizier.u-strasbg.fr/viz-bin/VizieR}.
-@itemize
-@item
-@code{2mass --> II/246/out} (2MASS All-Sky Catalog)
-@item
-@code{akarifis --> II/298/fis} (AKARI/FIS All-Sky Survey)
-@item
-@code{allwise --> II/328/allwise} (AllWISE Data Release)
-@item
-@code{apass9 --> II/336/apass9} (AAVSO Photometric All Sky Survey, DR9)
-@item
-@code{catwise --> II/365/catwise} (CatWISE 2020 catalog)
-@item
-@code{des1 --> II/357/des_dr1} (Dark Energy Survey data release 1)
-@item
-@code{gaiadr3 --> I/355/gaiadr3} (GAIA Data Release 3)
-@item
-@code{gaiaedr3 --> I/350/gaiadr3} (GAIA Early Data Release 3)
-@item
-@code{gaiadr2 --> I/345/gaia2} (GAIA Data Release 2)
-@item
-@code{galex5 --> II/312/ais} (All-sky Survey of GALEX DR5)
-@item
-@code{nomad --> I/297/out} (Naval Observatory Merged Astrometric Dataset)
-@item
-@code{panstarrs1 --> II/349/ps1} (Pan-STARRS Data Release 1).
-@item
-@code{ppmxl --> I/317/sample} (Positions and proper motions on the ICRS)
-@item
-@code{sdss12 --> V/147/sdss12} (SDSS Photometric Catalogue, Release 12)
-@item
-@code{usnob1 --> I/284/out} (Whole-Sky USNO-B1.0 Catalog)
-@item
-@code{ucac5 --> I/340/ucac5} (5th U.S. Naval Obs. CCD Astrograph Catalog)
-@item
-@code{unwise --> II/363/unwise} (Band-merged unWISE Catalog)
-@item
-@code{wise --> II/311/wise} (WISE All-Sky data Release)
-@end itemize
-@end table
+@item --outpolygon=STR1,STR2
+Only return rows where the given coordinates are outside the polygon specified
by the @option{--polygon} option.
+This option is very similar to the @option{--inpolygon} option, so see the
description there for more.
+@item --polygon=STR
+@itemx --polygon=FLT,FLT:FLT,FLT:...
+The polygon to use for the @code{--inpolygon} and @option{--outpolygon}
options.
+This option is parsed in an identical way to the same option in the Crop
program, so for more information on how to use it, see @ref{Crop options}.
+@item -e STR,INT/FLT,...
+@itemx --equal=STR,INT/FLT,...
+Only output rows that are equal to the given number(s) in the given column.
+The first argument is the column identifier (name or number, see
@ref{Selecting table columns}), after that you can specify any number of values.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
+For example, @option{--equal=ID,5,6,8} will only print the rows that have a
value of 5, 6, or 8 in the @code{ID} column.
+This option can also be called multiple times, so @option{--equal=ID,4,5
--equal=ID,6,7} has the same effect as @option{--equal=ID,4,5,6,7}.
-@node Invoking astquery, , Available databases, Query
-@subsection Invoking Query
+@cartouche
+@noindent
+@strong{Equality and floating point numbers:} Floating point numbers are only
approximate values (see @ref{Numeric data types}).
+In this context, their equality depends on how the input table was originally
stored (as a plain text table or as an ASCII/binary FITS table).
+If you want to select floating point numbers, it is strongly recommended to
use the @option{--range} option and set a very small interval around your
desired number, do not use @option{--equal} or @option{--notequal}.
+@end cartouche
-Query provides a high-level interface to downloading subsets of data from
databases.
-The executable name is @file{astquery} with the following general template
+The @option{--equal} and @option{--notequal} options also work when the given
column has a string type.
+In this case the given value to the option will also be parsed as a string,
not as a number.
+When dealing with string columns, be careful with trailing white space
characters (the actual value may be adjusted to the right, left, or center of
the column's width).
+If you need to account for such white spaces, you can use shell quoting.
+For example, @code{--equal=NAME," myname "}.
+@cartouche
+@noindent
+@strong{Strings with a comma (,):} When your desired column values contain a
comma, you need to put a `@code{\}' before the internal comma (within the
value).
+Otherwise, the comma will be interpreted as a delimiter between multiple
values, and anything after it will be interpreted as a separate string.
+For example, assume column @code{AB} of your @file{table.fits} contains this
value: `@code{cd,ef}' in your desired rows.
+To extract those rows, you should use the command below:
@example
-$ astquery DATABASE-NAME [OPTION...] ...
+$ asttable table.fits --equal=AB,cd\,ef
@end example
+@end cartouche
-@noindent
-One line examples:
+@item -n STR,INT/FLT,...
+@itemx --notequal=STR,INT/FLT,...
+Only output rows that are @emph{not} equal to the given number(s) in the given
column.
+The first argument is the column identifier (name or number, see
@ref{Selecting table columns}), after that you can specify any number of values.
+For example, @option{--notequal=ID,5,6,8} will only print the rows where the
@code{ID} column does not have a value of 5, 6, or 8.
+This option can also be called multiple times, so @option{--notequal=ID,4,5
--notequal=ID,6,7} has the same effect as @option{--notequal=ID,4,5,6,7}.
-@example
+Be very careful if you want to use the non-equality with floating point
numbers, see the special note under @option{--equal} for more.
+This option also works when the given column has a string type, see the
description under @option{--equal} (above) for more.
-## Information about all datasets in ESA's GAIA database:
-$ astquery gaia --information
+@item -b STR[,STR[,STR]]
+@itemx --noblank=STR[,STR[,STR]]
+Only output rows that are @emph{not} blank in the given column of the
@emph{input} table.
+Like above, the columns can be specified by their name or number (counting
from 1).
+This option can be called multiple times, so @option{--noblank=MAG
--noblank=PHOTOZ} is equivalent to @option{--noblank=MAG,PHOTOZ}.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-## Only show catalogs in VizieR that have 'MUSE' in their
-## description. The '-i' is short for '--information'.
-$ astquery vizier -i --limitinfo=MUSE
+For example, if @file{table.fits} has blank values (NaN in floating point
types) in the @code{magnitude} and @code{sn} columns, with
@code{--noblank=magnitude,sn}, the output will not contain any rows with blank
values in these two columns.
-## List of columns in 'J/A+A/608/A2/udf10' (one of the above).
-$ astquery vizier --dataset=J/A+A/608/A2/udf10 -i
+If you want @emph{all} columns to be checked, simply set the value to
@code{_all} (in other words: @option{--noblank=_all}).
+This mode is useful when there are many columns in the table and you want a
``clean'' output table (with no blank values in any column): entering their
name or number one-by-one can be buggy and frustrating.
+In this mode, no other column name should be given.
+For example, if you give @option{--noblank=_all,magnitude}, then Table will
assume that your table actually has a column named @code{_all} and
@code{magnitude}, and if it does not, it will abort with an error.
-## ID, RA and Dec of all Gaia sources within an image.
-$ astquery gaia --dataset=dr3 --overlapwith=image.fits \
- -csource_id,ra,dec
+If you want to change column values using @ref{Column arithmetic} (and set
some to blank, to later remove), or you want to select rows based on columns
that you have imported from other tables, you should use the
@option{--noblankend} option described below.
+Also, see @ref{Operation precedence in Table}.
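+
+For example, in the sketch below (with hypothetical column names), the
+@code{MAGNITUDE} column is only used to reject rows with blank values; it is
+not present in the output:
+
+@example
+$ asttable table.fits -cRA,DEC --noblank=MAGNITUDE
+@end example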
-## RA, Dec and Spectroscopic redshifts of objects in SDSS DR12
-## spectroscopic redshift that overlap with 'image.fits'.
-$ astquery vizier --dataset=sdss12 --overlapwith=image.fits \
- -cRA_ICRS,DE_ICRS,zsp --range=zsp,1e-10,inf
+@item -s STR
+@itemx --sort=STR
+Sort the output rows based on the values in the @code{STR} column (can be a
column name or number).
+By default the sort is done in ascending/increasing order, to sort in a
descending order, use @option{--descending}.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-## All columns of all entries in the Gaia DR3 catalog (hosted at
-## VizieR) within 1 arc-minute of the given coordinate.
-$ astquery vizier --dataset=gaiadr3 --output=my-gaia.fits \
- --center=113.8729761,31.9027152 --radius=1/60 \
+The chosen column does not have to be in the output columns.
+This is good when you just want to sort using one column's values, but do not
need that column anymore afterwards.
-## Similar to above, but only ID, RA and Dec columns for objects with
-## magnitude range 10 to 15. In VizieR, this column is called 'Gmag'.
-## Also, using sexagesimal coordinates instead of degrees for center.
-$ astquery vizier --dataset=gaiadr3 --output=my-gaia.fits \
- --center=07h35m29.51,31d54m9.77 --radius=1/60 \
- --range=Gmag,10:15 -cDR3Name,RAJ2000,DEJ2000
-@end example
+@item -d
+@itemx --descending
+When called with @option{--sort}, rows will be sorted in descending order.
-Query takes a single argument which is the name of the database.
-For the full list of available databases and accessing them, see
@ref{Available databases}.
-There are two methods to query the databases, each is more fully discussed in
its option's description below.
-@itemize
-@item
-@strong{Low-level:}
-With @option{--query} you can directly give a raw query statement that is
recognized by the database.
-This is very low level and will require a good knowledge of the database's
query language, but of course, it is much more powerful.
-If this option is given, the raw string is directly passed to the server and
all other constraints/options (for Query's high-level interface) are ignored.
-@item
-@strong{High-level:}
-With the high-level options (like @option{--column}, @option{--center},
@option{--radius}, @option{--range} and other constraining options below), the
low-level query will be constructed automatically for the particular database.
-This method is only limited to the generic capabilities that Query provides
for all servers.
-So @option{--query} is more powerful, however, in this mode, you do not need
any knowledge of the database's query language.
-You can see the internally generated query on the terminal (if
@option{--quiet} is not used) or in the 0-th extension of the output (if it is
a FITS file).
-This full command contains the internally generated query.
-@end itemize
+@item -H INT
+@itemx --head=INT
+Only print the given number of rows from the @emph{top} of the final table.
+Note that this option only affects the @emph{output} table.
+For example, if you use @option{--sort} or @option{--range}, the printed rows
are the first @emph{after} applying the sort, or the range selection, to the
full input.
+This option cannot be called with @option{--tail}, @option{--rowrange} or
@option{--rowrandom}.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-The name of the downloaded output file can be set with @option{--output}.
-The requested output format can have any of the @ref{Recognized table formats}
(currently @file{.txt} or @file{.fits}).
-Like all Gnuastro programs, if the output is a FITS file, the zero-th/first
HDU of the output will contain all the command-line options given to Query as
well as the full command used to access the server.
-When @option{--output} is not set, the output name will be in the format of
@file{NAME-STRING.fits}, where @file{NAME} is the name of the database and
@file{STRING} is a randomly selected 6-character set of numbers and alphabetic
characters.
-With this feature, a second run of @command{astquery} that is not called with
@option{--output} will not over-write an already downloaded one.
-Generally, when calling Query more than once, it is recommended to set an
output name for each call based on your project's context.
+@cindex GNU Coreutils
+If the given value to @option{--head} is 0, the output columns will not have
any rows and if it is larger than the number of rows in the input table, all
the rows are printed (this option is effectively ignored).
+This behavior is taken from the @command{head} program in GNU Coreutils.
-The outputs of Query will have a common output format, irrespective of the
used database.
-To achieve this, Query will ask the databases to provide a FITS table output
(for larger tables, FITS can consume much less download volume).
-After downloading is complete, the raw downloaded file will be read into
memory once by Query, and written into the file given to @option{--output}.
-The raw downloaded file will be deleted by default, but can be preserved with
the @option{--keeprawdownload} option.
-This strategy avoids unnecessary surprises depending on database.
-For example, some databases can download a compressed FITS table, even though
we ask for FITS.
-But with the strategy above, the final output will be an uncompressed FITS
file.
-The metadata that is added by Query (including the full download command) is
also very useful for future usage of the downloaded data.
-Unfortunately many databases do not write the input queries into their
generated tables.
+@item -t INT
+@itemx --tail=INT
+Only print the given number of rows from the @emph{bottom} of the final table.
+See @option{--head} for more.
+This option cannot be called with @option{--head}, @option{--rowrange} or
@option{--rowrandom}.
-@table @option
+@item --rowrange=INT,INT
+Only return the rows within the requested positional range (inclusive on both
sides).
+Therefore, @option{--rowrange=5,7} will return 3 of the input rows: rows 5, 6
and 7.
+This option will abort if any of the given values is larger than the total
number of rows in the table.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-@item --dry-run
-Only print the final download command to contact the server, do not actually
run it.
-This option is good when you want to check the finally constructed query or
download options given to the download program.
-You may also want to use the constructed command as a base to do further
customizations on it and run it yourself.
+With the @option{--head} or @option{--tail} options you can only see the top
or bottom few rows.
+However, with this option, you can limit the returned rows to a contiguous set
of rows in the middle of the table.
+Therefore this option cannot be called with @option{--head}, @option{--tail},
or @option{--rowrandom}.
-@item -k
-@itemx --keeprawdownload
-Do Not delete the raw downloaded file from the database.
-The name of the raw download will have a @file{OUTPUT-raw-download.fits}
format.
-Where @file{OUTPUT} is either the base-name of the final output file (without
a suffix).
+@item --rowrandom=INT
+@cindex Random row selection
+@cindex Row selection, by random
+Randomly select @code{INT} rows from the input table (assuming a uniform
distribution).
+This option is applied @emph{after} the value-based selection options (such as
@option{--sort}, @option{--range}, and @option{--polygon}).
+On the other hand, only the row counters are randomly selected; this option
does not change the order.
+Therefore, if @option{--rowrandom} is called together with @option{--sort},
the returned rows are still sorted.
+This option cannot be called with @option{--head}, @option{--tail}, or
@option{--rowrange}.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-@item -i
-@itemx --information
-Print the information of all datasets (tables) within a database or all
columns within a database.
-When @option{--dataset} is specified, the latter mode (all column information)
is downloaded and printed and when it is not defined, all dataset information
(within the database) is printed.
+This option will only have an effect if @code{INT} is smaller than the number
of rows when it is activated (after the value-based selection options have
been applied).
+When there are fewer rows than @code{INT}, a warning is printed, saying that
this option has no effect.
+The warning can be disabled with the @option{--quiet} option.
-Some databases (like VizieR) contain tens of thousands of datasets, so you can
limit the downloaded and printed information for available databases with the
@option{--limitinfo} option (described below).
-Dataset descriptions are often large and contain a lot of text (unlike column
descriptions).
-Therefore when printing the information of all datasets within a database, the
information (e.g., database name) will be printed on separate lines before the
description.
-However, when printing column information, the output has the same format as a
similar option in Table (see @ref{Invoking asttable}).
+@cindex Reproducibility
+Due to its nature (to be random), the output of this option differs in each
run.
+Therefore 5 calls to Table with @option{--rowrandom} on the same input table
will generate 5 different outputs.
+If you want a reproducible random selection, set the @code{GSL_RNG_SEED}
environment variable and also use the @option{--envseed} option, for more see
@ref{Generating random numbers}.
-Important note to consider: the printed order of the datasets or columns is
just for displaying in the printed output.
-You cannot ask for datasets or columns based on the printed order, you need to
use dataset or column names.
+@item --envseed
+Read the random number generator seed from the @code{GSL_RNG_SEED} environment
variable for @option{--rowrandom} (instead of generating a different seed
internally on every run).
+This is useful if you want a reproducible random selection of the input rows.
+For more, see @ref{Generating random numbers}.
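+
+For example, the sketch below should return the same random rows on every run
+(the seed value here is arbitrary):
+
+@example
+$ export GSL_RNG_SEED=1599251212
+$ asttable table.fits --rowrandom=100 --envseed -osample.fits
+@end example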
-@item -L STR
-@itemx --limitinfo=STR
-Limit the information that is downloaded and displayed (with
@option{--information}) to those that have the string given to this option in
their description.
-Note that @emph{this is case-sensitive}.
-This option is only relevant when @option{--information} is also called.
+@item -E STR[,STR[,STR]]
+@itemx --noblankend=STR[,STR[,STR]]
+Remove all rows in the requested @emph{output} columns that have a blank value.
+Like above, the columns can be specified by their name or number (counting
from 1).
+This option can be called multiple times, so @option{--noblankend=MAG
--noblankend=PHOTOZ} is equivalent to @option{--noblankend=MAG,PHOTOZ}.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-Databases may have thousands (or tens of thousands) of datasets.
-Therefore just the metadata (information) to show with @option{--information}
can be tens of megabytes (for example, the full VizieR metadata file is about
23Mb as of January 2021).
-Once downloaded, it can also be hard to parse manually.
-With @option{--limitinfo}, only the metadata of datasets that contain this
string @emph{in their description} will be downloaded and displayed, greatly
improving the speed of finding your desired dataset.
+For example, if your final output table (possibly after column arithmetic, or
adding new columns) has blank values (NaN in floating point types) in the
@code{magnitude} and @code{sn} columns, with @option{--noblankend=magnitude,sn},
the output will not contain any rows with blank values in these two columns.
-@item -Q "STR"
-@itemx --query="STR"
-Directly specify the query to be passed onto the database.
-The queries will generally contain space and other meta-characters, so we
recommend placing the query within quotations.
+If you want blank values to be removed from the main input table
@emph{before} any further processing (like adding columns, sorting or column
arithmetic), you should use the @option{--noblank} option.
+With the @option{--noblank} option, the given column(s) do not necessarily
have to be in the output (they are just used temporarily for reading the
inputs and selecting rows, and do not need to be present in the output).
+However, the column(s) given to this option should exist in the output.
-@item -s STR
-@itemx --dataset=STR
-The dataset to query within the database (not compatible with
@option{--query}).
-This option is mandatory when @option{--query} or @option{--information} are
not provided.
-You can see the list of available datasets within a database using
@option{--information} (possibly supplemented by @option{--limitinfo}).
-The output of @option{--information} will contain the recognized name of the
datasets within that database.
-You can pass the recognized name directly to this option.
-For more on finding and using your desired database, see @ref{Available
databases}.
+If you want @emph{all} columns to be checked, simply set the value to
@code{_all} (in other words: @option{--noblankend=_all}).
+This mode is useful when there are many columns in the table and you want a
``clean'' output table (with no blank values in any column): entering their
name or number one-by-one can be buggy and frustrating.
+In this mode, no other column name should be given.
+For example, if you give @option{--noblankend=_all,magnitude}, then Table will
assume that your table actually has a column named @code{_all} and
@code{magnitude}, and if it does not, it will abort with an error.
-@item -c STR
-@itemx --column=STR[,STR[,...]]
-The column name(s) to retrieve from the dataset in the given order (not
compatible with @option{--query}).
-If not given, all the dataset's columns for the selected rows will be queried
(which can be large!).
-This option can take multiple values in one instance (for example,
@option{--column=ra,dec,mag}), or in multiple instances (for example,
@option{-cra -cdec -cmag}), or mixed (for example, @option{-cra,dec -cmag}).
+This option is applied just before writing the final table (after
@option{--colmetadata} has finished).
+So in case you changed the column metadata, or added new columns, you can use
the new names, or the newly defined column numbers.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-In case you do not know the full list of the dataset's column names a priori,
and you do not want to download all the columns (which can greatly decrease
your download speed), you can use the @option{--information} option combined
with the @option{--dataset} option; see @ref{Available databases}.
+@item -m STR/INT,STR[,STR[,STR]]
+@itemx --colmetadata=STR/INT,STR[,STR[,STR]]
+Update the specified column metadata in the output table.
+This option is applied after all other column-related operations are complete,
for example, column arithmetic, or column concatenation.
+For the precedence of this operation in relation to others, see @ref{Operation
precedence in Table}.
-@item -H INT
-@itemx --head=INT
-Only ask for the first @code{INT} rows of the finally selected columns, not
all the rows.
-This can be good when your search can result in a large dataset, but before
downloading the full volume, you want to see the top rows and get a feeling of
what the whole dataset looks like.
+The first value (before the first comma) given to this option is the column's
identifier.
+It can either be a counter (positive integer, counting from 1), or a name (the
column's name in the output if this option was not called).
-@item -v FITS
-@itemx --overlapwith=FITS
-File name of a FITS file containing an image (in the HDU given by
@option{--hdu}) to use for identifying the region to query in the given
database and dataset.
-Based on the image's WCS and pixel size, the sky coverage of the image is
estimated and values to the @option{--center}, @option{--width} will be
calculated internally.
-Hence this option cannot be used with @code{--center}, @code{--width} or
@code{--radius}.
-Also, since it internally generates the query, it cannot be used with
@code{--query}.
+After the to-be-updated column is identified, at least one other string should
be given, with a maximum of three strings.
+The first string after the original name will be the selected column's new
name.
+The next (optional) string will be the selected column's unit and the third
(optional) will be its comments.
+If the two optional strings are not given, the original column's units or
comments will remain unchanged.
-Note that if the image has WCS distortions and the reference point for the WCS
is not within the image, the WCS will not be well-defined.
-Therefore the resulting catalog may not overlap, or may correspond to a
larger/smaller area in the sky.
+If any of the values contains a comma, you should place a `@code{\}' before
the comma to avoid it getting confused with a delimiter.
+For example, see the command below for a column description that contains a
comma:
-@item -C FLT,FLT
-@itemx --center=FLT,FLT
-The spatial center position (mostly RA and Dec) to use for the automatically
generated query (not compatible with @option{--query}).
-The comma-separated values can either be in degrees (a single number), or
sexagesimal (@code{_h_m_} for RA, @code{_d_m_} for Dec, or @code{_:_:_} for
both).
+@example
+$ asttable table.fits \
+ --colmetadata=NAME,UNIT,"Comments\, with a comma"
+@end example
-The given values will be compared to two columns in the database: rows within
a certain region around this center position will be requested and downloaded.
-Pre-defined RA and Dec column names are defined in Query for every database,
however you can use @option{--ccol} to select other columns to use instead.
-The region can either be a circle and the point (configured with
@option{--radius}) or a box/rectangle around the point (configured with
@option{--width}).
+Generally, since the comma is commonly used as a delimiter in many scenarios,
to avoid complicating your future analysis with the table, it is best to avoid
using a comma in the column name and units.
-@item --ccol=STR,STR
-The name of the coordinate-columns in the dataset to compare with the values
given to @option{--center}.
-Query will use its internal defaults for each dataset (for example,
@code{RAJ2000} and @code{DEJ2000} for VizieR data).
-But each dataset is treated separately and it is not guaranteed that these
columns exist in all datasets.
-Also, more than one coordinate system/epoch may be present in a dataset and
you can use this option to construct your spatial constraint based on the
other coordinate systems/epochs.
+Some examples of this option are available in the tutorials, in particular
@ref{Working with catalogs estimating colors}.
+Here are some more specific examples:
-@item -r FLT
-@itemx --radius=FLT
-The radius about the requested center to use for the automatically generated
query (not compatible with @option{--query}).
-The radius is in units of degrees, but you can use simple division with this
option directly on the command-line.
-For example, if you want a radius of 20 arc-minutes or 20 arc-seconds, you can
use @option{--radius=20/60} or @option{--radius=20/3600} respectively (which is
much more human-friendly than @code{0.3333} or @code{0.005556}).
+@table @option
-@item -w FLT[,FLT]
-@itemx --width=FLT[,FLT]
-The square (or rectangle) side length (width) about the requested center to
use for the automatically generated query (not compatible with
@option{--query}).
-If only one value is given to @code{--width} the region will be a square, but
if two values are given, the widths of the query box along each dimension will
be different.
-The value(s) is (are) in the same units as the coordinate column (see
@option{--ccol}, usually RA and Dec which are degrees).
-You can use simple division for each value directly on the command-line if you
want relatively small (and more human-friendly) sizes.
-For example, if you want your box to be 1 arc-minutes along the RA and 2
arc-minutes along Dec, you can use @option{--width=1/60,2/60}.
+@item --colmetadata=MAGNITUDE,MAG_F160W
+This will convert name of the original @code{MAGNITUDE} column to
@code{MAG_F160W}, leaving the unit and comments unchanged.
-@item -g STR,FLT,FLT
-@itemx --range=STR,FLT,FLT
-The column name and numerical range (inclusive) of acceptable values in that
column (not compatible with @option{--query}).
-This option can be called multiple times for applying range limits on many
columns in one call (thus greatly reducing the download size).
-For example, when used on the ESA gaia database, you can use
@code{--range=phot_g_mean_mag,10:15} to only get rows that have a value between
10 and 15 (inclusive on both sides) in the @code{phot_g_mean_mag} column.
+@item --colmetadata=3,MAG_F160W,mag
+This will convert name of the third column of the final output to
@code{MAG_F160W} and the units to @code{mag}, while leaving the comments
untouched.
-If you want all rows larger, or smaller, than a certain number, you can use
@code{inf}, or @code{-inf} as the first or second values respectively.
-For example, if you want objects with SDSS spectroscopic redshifts larger than
2 (from the VizieR @code{sdss12} database), you can use
@option{--range=zsp,2,inf}.
+@item --colmetadata=MAGNITUDE,MAG_F160W,mag,"Magnitude in F160W filter"
+This will convert name of the original @code{MAGNITUDE} column to
@code{MAG_F160W}, and the units to @code{mag} and the comments to
@code{Magnitude in F160W filter}.
+Note the double quotations around the comment string: they are necessary to
preserve the white-space characters of the column comment as it passes from
the command-line into the program (otherwise, upon reaching a white-space
character, the shell will consider the option to be finished, causing
unexpected behavior).
+@end table
-If you want the interval to not be inclusive on both sides, you can run
@code{astquery} once and get the command that it executes.
-Then you can edit it to be non-inclusive on your desired side.
+If your table is large and generated by a script, you can first do all your
operations on your table's data and write it into a temporary file (maybe
called @file{temp.fits}).
+Then, look into that file's metadata (with @command{asttable temp.fits -i}) to
see the exact column positions and possible names, then add the necessary calls
to this option to your previous call to @command{asttable}, so it writes proper
metadata in the same run (for example, in a script or Makefile).
+Recall that when a name is given, this option will update the metadata of the
first column that matches, so if you have multiple columns with the same name,
you can call this option multiple times with the same first argument to change
them all to different names.
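+Here is a minimal sketch of this two-step workflow (the file and column
+names are hypothetical):
+@example
+$ asttable table.fits -c1,2,3 --output=temp.fits
+$ asttable temp.fits -i
+$ asttable table.fits -c1,2,3 --output=final.fits \
+           --colmetadata=3,MAG_F160W,mag,"Magnitude in F160W"
+@end example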
-@item -b STR[,STR]
-@item --noblank=STR[,STR]
-Only ask for rows that do not have a blank value in the @code{STR} column.
-This option can be called many times, and each call can have multiple column
names (separated by a comma or @key{,}).
-For example, if you want the retrieved rows to not have a blank value in
columns @code{A}, @code{B}, @code{C} and @code{D}, you can use
@command{--noblank=A -bB,C,D}.
+Finally, if you already have a FITS table by other means (for example, by
downloading) and you merely want to update the column metadata and leave the
data intact, it is much more efficient to directly modify the respective FITS
header keywords with @code{astfits}, using the keyword manipulation features
described in @ref{Keyword inspection and manipulation}.
+@option{--colmetadata} is mainly intended for scenarios where you also want to
edit the data: Table will always load the full/partial dataset into memory,
then write out the resulting dataset with the updated/corrected metadata.
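+For example, here is a minimal sketch of such a direct header edit (assuming
+you want to rename the second column of a hypothetical @file{table.fits}; in
+a FITS table, its name and unit are kept in the @code{TTYPE2} and
+@code{TUNIT2} keywords of the table's HDU):
+@example
+$ astfits table.fits -h1 --update=TTYPE2,MAG_F160W \
+                         --update=TUNIT2,mag
+@end example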
-@item --sort=STR[,STR]
-Ask for the server to sort the downloaded data based on the given columns.
-For example, let's assume your desired catalog has column @code{Z} for
redshift and column @code{MAG_R} for magnitude in the R band.
-When you call @option{--sort=Z,MAG_R}, it will primarily sort the columns
based on the redshift, but if two objects have the same redshift, they will be
sorted by magnitude.
-You can add as many columns as you like for higher-level sorting.
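-For example, a minimal sketch of server-side sorting (the dataset and column
-names here are hypothetical placeholders):
-@example
-$ astquery vizier --dataset=YOUR-DATASET -cZ,MAG_R --sort=Z,MAG_R
-@end example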
+
+@item -f STR
+@itemx --txtf32format=STR
+The plain-text format of 32-bit floating point columns when output is not
binary (this option is ignored for binary outputs like FITS tables, see
@ref{Printing floating point numbers}).
+The acceptable values are listed below.
+This is just the format of the plain-text outputs; see
@option{--txtf32precision} for customizing their precision.
+@table @code
+@item fixed
+Fixed-point notation (for example @code{123.4567}).
+@item exp
+Exponential notation (for example @code{1.234567e+02}).
@end table
+The default mode is @code{exp} since it is the most generic and will not cause
any loss of data.
+Be very cautious if you set it to @code{fixed}.
+As a rule of thumb, the fixed-point notation is only good if the numbers are
larger than 1.0, but not too large!
+Given that the total number of accurate decimal digits is fixed, the more
digits you have on the left of the decimal point (the integer part), the more
inaccurate digits will be printed on the right of the decimal point.
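+For example, here is a minimal sketch (assuming a hypothetical
+@file{table.fits} that contains a 32-bit floating point column):
+@example
+$ asttable table.fits --txtf32format=fixed --txtf32precision=2
+@end example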
+@item -p INT
+@itemx --txtf32precision=INT
+Number of digits after (to the right side of) the decimal point (precision)
for columns with a 32-bit floating point datatype (this option is ignored for
binary outputs like FITS tables, see @ref{Printing floating point numbers}).
+This can take any positive integer (including 0).
+When given a value of zero, the floating point number will be rounded to the
nearest integer.
+@cindex IEEE 754
+The default value to this option is 6.
+This is because according to IEEE 754, 32-bit floating point numbers can be
accurately represented to 7.22 decimal digits (see @ref{Printing floating
point numbers}).
+Since we can only have an integer number of digits, this rounds to 7 decimal
digits.
+Furthermore, the precision is only defined to the right side of the decimal
point.
+In exponential notation (default of @option{--txtf32format}), one decimal
digit will be printed on the left of the decimal point.
+So the default value to this option is @mymath{7-1=6}.
+@item -A STR
+@itemx --txtf64format=STR
+The plain-text format of 64-bit floating point columns when output is not
binary (this option is ignored for binary outputs like FITS tables, see
@ref{Printing floating point numbers}).
+The acceptable values are listed below.
+This is just the format of the plain-text outputs; see
@option{--txtf64precision} for customizing their precision.
+@table @code
+@item fixed
+Fixed-point notation (for example @code{12345.6789012345}).
+@item exp
+Exponential notation (for example @code{1.23456789012345e+04}).
+@end table
+The default mode is @code{exp} since it is the most generic and will not cause
any loss of data.
+Be very cautious if you set it to @code{fixed}.
+As a rule of thumb, the fixed-point notation is only good if the numbers are
larger than 1.0, but not too large!
+Given that the total number of accurate decimal digits is fixed, the more
digits you have on the left of the decimal point (the integer part), the more
inaccurate digits will be printed on the right of the decimal point.
+@item -B INT
+@itemx --txtf64precision=INT
+Number of digits after the decimal point (precision) for columns with a 64-bit
floating point datatype (this option is ignored for binary outputs like FITS
tables, see @ref{Printing floating point numbers}).
+This can take any positive integer (including 0).
+When given a value of zero, the floating point number will be rounded to the
nearest integer.
+@cindex IEEE 754
+The default value to this option is 15.
+This is because according to IEEE 754, 64-bit floating point numbers can be
accurately represented to 15.95 decimal digits (see @ref{Printing floating
point numbers}).
+Since we can only have an integer number of digits, this rounds to 16 decimal
digits.
+Furthermore, the precision is only defined to the right side of the decimal
point.
+In exponential notation (default of @option{--txtf64format}), one decimal
digit will be printed on the left of the decimal point.
+So the default value to this option is @mymath{16-1=15}.
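+For example, here is a minimal sketch of these two options (a value read from
+the standard input is parsed as a 64-bit floating point column):
+@example
+$ echo 123.45678 | asttable --txtf64format=fixed --txtf64precision=2
+123.46
+@end example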
+@item -Y
+@itemx --txteasy
+When output is a plain-text file or just gets printed on standard output (the
terminal), all floating point columns are printed in fixed point notation (as
in @code{123.456}) instead of the default exponential notation (as in
@code{1.23456e+02}).
+For 32-bit floating points, this option will use a precision of 3 digits (see
@option{--txtf32precision}) and for 64-bit floating points use a precision of 6
digits (see @option{--txtf64precision}).
+This can be useful for human readability, but be careful with some scenarios
(for example @code{1.23e-120}, which will show only as @code{0.0}!).
+When this option is called, any value given to the following options is
ignored: @option{--txtf32format}, @option{--txtf32precision},
@option{--txtf64format} and @option{--txtf64precision}.
+For example, below you can see the output of a table with and without this
option:
+@example
+$ asttable table.fits --head=5 -O
+# Column 1: OBJNAME [name ,str23, ] Name in HyperLeda.
+# Column 2: RAJ2000 [deg ,f64 , ] Right Ascension.
+# Column 3: DEJ2000 [deg ,f64 , ] Declination.
+# Column 4: RADIUS [arcmin,f32 , ] Major axis radius.
+NGC0884 2.3736267000000e+00 5.7138753300000e+01 8.994357e+00
+NGC1629 4.4935191000000e+00 -7.1838322400000e+01 5.000000e-01
+NGC1673 4.7109672000000e+00 -6.9820892700000e+01 3.499210e-01
+NGC1842 5.1216920000000e+00 -6.7273195300000e+01 3.999171e-01
+$ asttable table.fits --head=5 -O -Y
+# Column 1: OBJNAME [name ,str23, ] Name in HyperLeda.
+# Column 2: RAJ2000 [deg ,f64 , ] Right Ascension.
+# Column 3: DEJ2000 [deg ,f64 , ] Declination.
+# Column 4: RADIUS [arcmin,f32 , ] Major axis radius.
+NGC0884 2.373627 57.138753 8.994
+NGC1629 4.493519 -71.838322 0.500
+NGC1673 4.710967 -69.820893 0.350
+NGC1842 5.121692 -67.273195 0.400
+@end example
+This is also useful when you want to make outputs of other programs more
``easy'' to read, for example:
+@example
+$ echo 123.45678 | asttable
+1.234567800000000e+02
+$ echo 123.45678 | asttable -Y
+123.456780
+@end example
+@cartouche
+@noindent
+@strong{Can result in loss of information}: be very careful with this option!
+It can lose precision, or even the entire value, if the value is not within a
``good'' range, as in the example below.
+Such cases are the reason that this is not the default format of plain-text
outputs.
+@example
+$ echo 123.4e-9 | asttable -Y
+0.000000
+@end example
+@end cartouche
+@end table
-@node Data manipulation, Data analysis, Data containers, Top
-@chapter Data manipulation
-Images are one of the major formats of data used in astronomy.
-This chapter describes the GNU Astronomy Utilities that are provided for
manipulating them: for example, cropping out a part of a larger image,
convolving an image with a given kernel, or applying a transformation to it.
-@menu
-* Crop:: Crop region(s) from a dataset.
-* Arithmetic:: Arithmetic on input data.
-* Convolve:: Convolve an image with a kernel.
-* Warp:: Warp/Transform an image to a different grid.
-@end menu
-@node Crop, Arithmetic, Data manipulation, Data manipulation
-@section Crop
-@cindex Section of an image
-@cindex Crop part of image
-@cindex Postage stamp images
-@cindex Large astronomical images
-@pindex @r{Crop (}astcrop@r{)}
-Astronomical images are often very large, filled with thousands of galaxies.
-It often happens that you only want a section of the image, or you have a
catalog of sources and you want to visually analyze them in small postage
stamps.
-Crop is made to do all these things.
-When more than one crop is required, Crop will divide the crops between
multiple threads to significantly reduce the run time.
-@cindex Mosaicing
-@cindex Image tiles
-@cindex Image mosaic
-@cindex COSMOS survey
-@cindex Imaging surveys
-@cindex Hubble Space Telescope (HST)
-Astronomical surveys are usually extremely large.
-So large in fact, that the whole survey will not fit into a reasonably sized
file.
-Because of this, surveys usually cut the final image into separate tiles and
store each tile in a file.
-For example, the COSMOS survey's Hubble space telescope, ACS F814W image
consists of 81 separate FITS images, with each one having a volume of 1.7
Gigabytes.
-@cindex Stitch multiple images
-Even though the tile sizes are chosen to be large enough that too many
galaxies/targets do not fall on the edges of the tiles, inevitably some do.
-So when you simply crop the image of such targets from one tile, you will miss
a large area of the surrounding sky (which is essential in estimating the
noise).
-Therefore in its WCS mode, Crop will stitch parts of the tiles that are
relevant for a target (with the given width) from all the input images that
cover that region into the output.
-Of course, the tiles have to be present in the list of input files.
-Besides cropping postage stamps around certain coordinates, Crop can also crop
arbitrary polygons from an image (or a set of tiles by stitching the relevant
parts of different tiles within the polygon), see @option{--polygon} in
@ref{Invoking astcrop}.
-Alternatively, it can crop out rectangular regions through the
@option{--section} option from one image, see @ref{Crop section syntax}.
-@menu
-* Crop modes:: Basic modes to define crop region.
-* Crop section syntax:: How to define a section to crop.
-* Blank pixels:: Pixels with no value.
-* Invoking astcrop:: Calling Crop on the command-line
-@end menu
-@node Crop modes, Crop section syntax, Crop, Crop
-@subsection Crop modes
-In order to be comprehensive, intuitive, and easy to use, there are two ways
to define the crop:
-@enumerate
-@item
-From its center and side length.
-For example, if you already know the coordinates of an object and want to
inspect it in an image or to generate postage stamps of a catalog containing
many such coordinates.
-@item
-The vertices of the crop region; this can be useful for larger crops over
-many targets, for example, to crop out a uniformly deep, or contiguous,
-region of a large survey.
-@end enumerate
-
-Irrespective of how the crop region is defined, the coordinates to define the
crop can be in Image (pixel) or World Coordinate System (WCS) standards.
-All coordinates are read as floating point numbers (not integers, except for
the @option{--section} option, see below).
-By setting the @emph{mode} in Crop, you define the standard in which the given
coordinates must be interpreted.
-Here, the different ways to specify the crop region are discussed within each
standard.
-For the full list of options, please see @ref{Invoking astcrop}.
-
-When the crop is defined by its center, the respective (integer) central pixel
position will be found internally according to the FITS standard.
-To have this pixel positioned in the center of the cropped region, the final
cropped region will have an odd number of pixels (even if you give an even
number to @option{--width} in image mode).
-Furthermore, when the crop is defined by its center, Crop allows you to only
keep crops that do not have any blank pixels in the vicinity of their center
(your primary target).
-This can be very convenient when your input catalog/coordinates originated
from another survey/filter which is not fully covered by your input image.
To learn more about this feature, please see the description of the
@option{--checkcenter} option in @ref{Invoking astcrop}.
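-For example, here is a minimal sketch of this feature (the catalog, its
-column names and the central check width of 5 pixels are all hypothetical):
-@example
-$ astcrop image.fits --mode=wcs --catalog=cat.txt --width=30/3600 \
-          --coordcol=RA --coordcol=DEC --checkcenter=5
-@end example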
-@table @asis
-@item Image coordinates
-In image mode (@option{--mode=img}), Crop interprets the pixel coordinates and
widths in units of the input data-elements (for example, pixels in an image,
not world coordinates).
-In image mode, only one image may be input.
-The output crop(s) can be defined in multiple ways as listed below.
-@table @asis
-@item Center of multiple crops (in a catalog)
-The center of (possibly multiple) crops are read from a text file.
-In this mode, the columns identified with the @option{--coordcol} option are
interpreted as the center of a crop with a width of @option{--width} pixels
along each dimension.
-The columns can contain any floating point value.
-The value given to the @option{--output} option is seen as a directory which
will host the (possibly multiple) separate crop files; see @ref{Crop output}
for more.
-For a tutorial using this feature, please see @ref{Reddest clumps cutouts and
parallelization}.
-@item Center of a single crop (on the command-line)
-The center of the crop is given on the command-line with the @option{--center}
option.
-The crop width is specified by the @option{--width} option along each
dimension.
-The given coordinates and width can be any floating point number.
+@node Query, , Table, Data containers
+@section Query
-@item Vertices of a single crop
-In Image mode there are two options to define the vertices of a region to
crop: @option{--section} and @option{--polygon}.
-The former is lower-level (does not accept floating point vertices, and only a
rectangular region can be defined), it is also only available in Image mode.
-Please see @ref{Crop section syntax} for a full description of this method.
+@cindex IVOA
+@cindex Query
+@cindex TAP (Table Access Protocol)
+@cindex ADQL (Astronomical Data Query Language)
+@cindex Astronomical Data Query Language (ADQL)
+There are many astronomical databases available for downloading astronomical
data.
+Most follow the International Virtual Observatory Alliance (IVOA,
@url{https://ivoa.net}) standards (and in particular the Table Access Protocol,
or TAP@footnote{@url{https://ivoa.net/documents/TAP}}).
+With TAP, it is possible to submit your queries via a command-line downloader
(for example, @command{curl}) to only get specific tables, targets (rows in a
table) or measurements (columns in a table): you do not have to download the
full table (which can be very large in some cases)!
+These customizations are done through the Astronomical Data Query Language
(ADQL@footnote{@url{https://ivoa.net/documents/ADQL}}).
-The latter option (@option{--polygon}) is a higher-level method to define any
polygon (with any number of vertices) with floating point values.
-Please see the description of this option in @ref{Invoking astcrop} for its
syntax.
-@end table
+Therefore, if you are sufficiently familiar with TAP and ADQL, you can easily
custom-download any part of an online dataset.
+However, you also need to keep a record of the URLs of each database and in
many cases, the commands will become long and hard/buggy to type on the
command-line.
+On the other hand, most astronomers do not know TAP or ADQL at all, and are
forced to go to the database's web page which is slow (it needs to download so
many images, and has too much annoying information), requires manual
interaction (further making it slow and buggy), and cannot be automated.
-@item WCS coordinates
-In WCS mode (@option{--mode=wcs}), the coordinates and width are interpreted
using the World Coordinate System (WCS, that must accompany the dataset), not
pixel coordinates.
-You can optionally use @option{--widthinpix} for the width to be interpreted
in pixels (even though the coordinates are in WCS).
-In WCS mode, Crop accepts multiple datasets as input.
-When the cropped region (defined by its center or vertices) overlaps with
multiple of the input images/tiles, the overlapping regions will be taken from
the respective input (they will be stitched when necessary for each output
crop).
+Gnuastro's Query program is designed to be the middle-man in this process: it
provides a simple high-level interface to let you specify your constraints on
what you want to download.
+It then internally constructs the command to download the data based on your
inputs and runs it to download your desired data.
+Query also prints the full command before it executes it (if not called with
@option{--quiet}).
+Also, if you ask for a FITS output table, the full command is written into its
0-th extension along with the other input parameters to Query (all Gnuastro
programs generally keep their input configuration parameters as FITS keywords
in the zero-th extension of their output).
+You can see it with Gnuastro's Fits program, like below:
-In this mode, the input images do not necessarily have to be the same size;
they just need to have the same orientation and pixel resolution.
-Currently only orientation along the celestial coordinates is accepted; if
your input has a different orientation or resolution you can use Warp's
@option{--gridfile} option to align the image before cropping it (see
@ref{Warp}).
+@example
+$ astfits query-output.fits -h0
+@end example
-Each individual input image/tile can even be smaller than the final crop.
-In any case, any part of any of the input images which overlaps with the
desired region will be used in the crop.
-Note that if there is an overlap in the input images/tiles, the pixels from
the last input image read are going to be used for the overlap.
-Crop will not change pixel values, so it assumes your overlapping tiles were
cut out from the same original image.
-There are multiple ways to define your cropped region as listed below.
+With the full command used to download the dataset, you only need a minimal
knowledge of ADQL to do lower-level customizations on your downloaded dataset.
+You can simply copy that command and change the parts of the query string you
want: ADQL is very powerful!
+For example, you can ask the server to do mathematical operations on the
columns and apply selections after those operations, or combine/match multiple
datasets.
+We will try to add high-level interfaces for such capabilities, but generally,
do not limit yourself to the high-level operations (that cannot cover
everything!).
-@table @asis
+@menu
+* Available databases:: List of available databases to Query.
+* Invoking astquery:: Inputs, outputs and configuration of Query.
+@end menu
-@item Center of multiple crops (in a catalog)
-Similar to catalog inputs in Image mode (above), except that the values along
each dimension are assumed to have the same units as the dataset's WCS
information.
-For example, the central RA and Dec value for each crop will be read from the
first and second calls to the @option{--coordcol} option.
-The width of the cropped box (in units of the WCS, or degrees in RA and Dec
mode) must be specified with the @option{--width} option.
-You can optionally use @option{--widthinpix} for the value of @option{--width}
to be interpreted in pixels.
+@node Available databases, Invoking astquery, Query, Query
+@subsection Available databases
-@item Center of a single crop (on the command-line)
-You can specify the center of only one crop box with the @option{--center}
option.
-If it exists in the input images, it will be cropped similar to the catalog
mode, see above also for @code{--width}.
+The databases currently supported by Query are listed at the end of this
section.
+To get the list of available datasets within each database, you can use the
@option{--information} option.
+For example, with the command below you can get a list of the roughly 100
datasets that are available within the ESA Gaia server, with their
descriptions:
-@item Vertices of a single crop
-The @option{--polygon} option is a high-level method to define any convex
polygon (with any number of vertices).
-Please see the description of this option in @ref{Invoking astcrop} for its
syntax.
-@end table
+@example
+$ astquery gaia --information
+@end example
-@cartouche
@noindent
-@strong{CAUTION:} In WCS mode, the image has to be aligned with the celestial
coordinates, such that the first FITS axis is parallel (opposite direction) to
the Right Ascension (RA) and the second FITS axis is parallel to the
declination.
-If these conditions are not met for an image, Crop will warn you and abort.
-You can use Warp to align the input image to standard celestial coordinates,
see @ref{Warp}.
-@end cartouche
+However, other databases like VizieR host many more datasets (tens of
thousands!).
+Therefore it is very inconvenient to get the @emph{full} information every
time you want to find your dataset of interest (the full metadata file of
VizieR is more than 20Mb).
+In such cases, you can limit the downloaded and displayed information with the
@option{--limitinfo} option.
+For example, with the first command below you can get all datasets relating to
MUSE (an instrument on the Very Large Telescope), and with the second, those
that include Roland Bacon (Principal Investigator of MUSE) as an author
(@code{Bacon, R.}).
+Recall that @option{-i} is the short format of @option{--information}.
-@end table
+@example
+$ astquery vizier -i --limitinfo=MUSE
+$ astquery vizier -i --limitinfo="Bacon R."
+@end example
-As a summary, if you do not specify a catalog, you have to define the cropped
region manually on the command-line.
-In any case the mode is mandatory for Crop to be able to interpret the values
given as coordinates or widths.
+Once you find the recognized name of your desired dataset, you can see the
column information of that dataset by adding the dataset name.
+For example, with the command below you can see the column metadata of the
@code{J/A+A/608/A2/udf10} dataset (one of the datasets in the search above):
+@example
+$ astquery vizier --dataset=J/A+A/608/A2/udf10 -i
+@end example
-@node Crop section syntax, Blank pixels, Crop modes, Crop
-@subsection Crop section syntax
+@cindex SDSS DR12
+For very popular datasets of a database, Query provides an easier-to-remember
short name that you can feed to @option{--dataset}.
+This short name will map to the officially recognized name of the dataset on
the server.
+In this mode, Query will also set positional columns accordingly.
+For example, most VizieR datasets have an @code{RAJ2000} column (the RA in the
epoch 2000), so it is the default RA column name for coordinate searches (using
@option{--center} or @option{--overlapwith}).
+However, some datasets do not have this column (for example, SDSS DR12).
+So when you use the short name and Query knows about this dataset, it will
internally set the coordinate columns that SDSS DR12 has: @code{RA_ICRS} and
@code{DEC_ICRS}.
+Recall that you can always change the coordinate columns with @option{--ccol}.
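+For example, here is a minimal sketch of setting the coordinate columns
+explicitly (redundant here, since the @code{sdss12} short name already sets
+them internally; the central coordinate is hypothetical):
+@example
+$ astquery vizier --dataset=sdss12 --center=189.16,62.21 \
+           --radius=1/60 --ccol=RA_ICRS,DE_ICRS -czsp
+@end example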
-@cindex Crop a given section of image
-When in image mode, one of the methods to crop only one rectangular section
from the input image is to use the @option{--section} option.
-Crop has a powerful syntax to read the box parameters from a string of
characters.
-If you leave certain parts of the string to be empty, Crop can fill them for
you based on the input image sizes.
+For example, in the VizieR and Gaia databases, the recognized name for data
release 3 data is respectively @code{I/355/gaiadr3} and
@code{gaiadr3.gaia_source}.
+These technical names are hard to remember.
+Therefore Query provides @code{gaiadr3} (for VizieR) and @code{dr3} (for ESA's
Gaia database) shortcuts which you can give to @option{--dataset} instead.
+They will be internally mapped to the fully recognized name by Query.
+The list below describes the available databases, along with the short names
that are recognized for each.
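+For example, both of the commands below (a simple sketch) point to the Gaia
+DR3 source catalog, on two different servers:
+@example
+$ astquery gaia --dataset=dr3 -i
+$ astquery vizier --dataset=gaiadr3 -i
+@end example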
-@cindex Define section to crop
-To define a box, you need the coordinates of two points: the first (@code{X1},
@code{Y1}) and the last pixel (@code{X2}, @code{Y2}) pixel positions in the
image, or four integer numbers in total.
-The four coordinates can be specified with one string in this format:
`@command{X1:X2,Y1:Y2}'.
-This string is given to the @option{--section} option.
-Therefore, the pixels along the first axis that are @mymath{\geq}@command{X1}
and @mymath{\leq}@command{X2} will be included in the cropped image.
-The same goes for the second axis.
-Note that each different term will be read as an integer, not a float.
+@cartouche
+@noindent
+@strong{Not all datasets support TAP:} Large databases like VizieR have TAP
access for all their datasets.
+However, smaller databases have not implemented TAP for all their tables.
+Therefore some datasets that are searchable in their web interface may not be
available for a TAP search.
+To see the full list of TAP-ed datasets in a database, use the
@option{--information} (or @option{-i}) option with the database name, like the
command below.
-The reason it only accepts integers is that @option{--section} is a low-level
option (which is also very fast!).
-For a higher-level way to specify a region (any polygon, not just a box),
please see the @option{--polygon} option in @ref{Crop options}.
-Also note that in the FITS standard, pixel indexes along each axis start from
unity (1), not zero (0).
+@example
+$ astquery astron -i
+@end example
-@cindex Crop section format
-You can omit any of the values and they will be filled automatically.
-The left hand side of the colon (@command{:}) will be filled with @command{1},
and the right side with the image size.
-So, @command{2:,:} will include the full range of pixels along the second axis
and only those with a first axis index larger than @command{2} in the first
axis.
-If the colon is omitted for a dimension, then the full range is automatically
used.
-So the same string is also equal to @command{2:,} or @command{2:} or even
@command{2}.
-If you want such a case for the second axis, you should set it to:
@command{,2}.
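-For example, the commands below (a sketch on a hypothetical
-@file{image.fits}) are therefore all equivalent:
-@example
-$ astcrop image.fits --mode=img --section=2:,:
-$ astcrop image.fits --mode=img --section=2:,
-$ astcrop image.fits --mode=img --section=2:
-@end example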
+@noindent
+If your desired dataset is not in this list, but has web access, contact the
database maintainers and ask them to add TAP access for it.
+After they do it, you should see the name added to the output list of the
command above.
+@end cartouche
-If you specify a negative value, it will be interpreted as an index before the
start of the image, that is, outside the image along the bottom or left sides
when viewed in SAO DS9.
-In case you want to count from the top or right sides of the image, you can
use an asterisk (@option{*}).
-When confronted with a @option{*}, Crop will replace it with the maximum
length of the image in that dimension.
-So @command{*-10:*+10,*-20:*+20} will mean that the crop box will be
@mymath{20\times40} pixels in size and only include the top corner of the input
image, with 3/4 of the crop being covered by blank pixels, see @ref{Blank
pixels}.
+The list of databases recognized by Query (and their names in Query) is
described below.
+Since Query is a new member of the Gnuastro family (first available in
Gnuastro 0.14), this list will hopefully grow significantly in the next
releases.
+If you have any particular datasets in mind, please let us know by sending an
email to @code{bug-gnuastro@@gnu.org}.
+If the dataset supports IVOA's TAP (Table Access Protocol), it should be very
easy to add.
-If you feel more comfortable with space characters between the values, you can
use as many space characters as you wish; just be careful to put your value in
double quotes, for example, @command{--section="5:200, 123:854"}.
-If you forget the quotes, anything after the first space will not be seen by
@option{--section} and you will most probably get an error because the rest of
your string will be read as a filename (which most probably does not exist).
-See @ref{Command-line} for a description of how the command-line works.
+@table @code
-@node Blank pixels, Invoking astcrop, Crop section syntax, Crop
-@subsection Blank pixels
+@item astron
+@cindex ASTRON
+@cindex Radio astronomy
+The ASTRON Virtual Observatory service (@url{https://vo.astron.nl}) is a
database focused on radio astronomy data and images, primarily those collected
by ASTRON itself.
+A query to @code{astron} is submitted to
@code{https://vo.astron.nl/__system__/tap/run/tap/sync}.
-@cindex Blank pixel
-The cropped box can potentially include pixels that are beyond the image range.
-For example, when a target in the input catalog was very near the edge of the
input image.
-The parts of the cropped image that were not in the input image will be filled
with the following two values depending on the data type of the image.
-In both cases, SAO DS9 will not color code those pixels.
+Here is the list of short names for dataset(s) in ASTRON's VO service:
@itemize
@item
-If the data type of the image is a floating point type (float or double), IEEE
NaN (Not a number) will be used.
-@item
-For integer types, pixels out of the image will be filled with the value of
the @command{BLANK} keyword in the cropped image header.
-The value assigned to it is the lowest value possible for that type, so you
will probably never need it any way.
-Only for the unsigned character type (@command{BITPIX=8} in the FITS header),
the maximum value is used because it is unsigned, the smallest value is zero
which is often meaningful.
+@code{tgssadr --> tgssadr.main}
@end itemize
-You can ask for such blank regions to not be included in the output crop image
using the @option{--noblank} option.
-In such cases, there is no guarantee that the image sizes of your outputs are
what you asked for.
-Unfortunately, some survey images do not use the @command{BLANK} FITS keyword.
-Instead, they just give all pixels outside of the survey area a value of zero.
-So by default, when dealing with float or double image types, any values that
are 0.0 are also regarded as blank regions.
-This can be turned off with the @option{--zeroisnotblank} option.
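-For example, a minimal sketch (the image and crop parameters are
-hypothetical):
-@example
-$ astcrop image.fits --mode=img --center=501,501 --width=101 \
-          --zeroisnotblank
-@end example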
+@item gaia
+@cindex Gaia catalog
+@cindex Catalog, Gaia
+@cindex Database, Gaia
+The Gaia project (@url{https://www.cosmos.esa.int/web/gaia}) database which is
a large collection of star positions on the celestial sphere, as well as
peculiar velocities, parallaxes and magnitudes in some bands among many others.
+Besides scientific studies (like studying resolved stellar populations in the
Galaxy and its halo), Gaia is also invaluable for raw data calibrations, like
astrometry.
+A query to @code{gaia} is submitted to
@code{https://gea.esac.esa.int/tap-server/tap/sync}.
+Here is the list of short names for popular datasets within Gaia:
+@itemize
+@item
+@code{dr3 --> gaiadr3.gaia_source}
+@item
+@code{edr3 --> gaiaedr3.gaia_source}
+@item
+@code{dr2 --> gaiadr2.gaia_source}
+@item
+@code{dr1 --> gaiadr1.gaia_source}
+@item
+@code{tycho2 --> public.tycho2}
+@item
+@code{hipparcos --> public.hipparcos}
+@end itemize
-@node Invoking astcrop, , Blank pixels, Crop
-@subsection Invoking Crop
+@item ned
+@cindex NASA/IPAC Extragalactic Database (NED)
+@cindex NED (NASA/IPAC Extragalactic Database)
+The NASA/IPAC Extragalactic Database (NED, @url{http://ned.ipac.caltech.edu})
is a fusion database, integrating the information about extra-galactic sources
from many large sky surveys into a single catalog.
+It covers the full spectrum, from Gamma rays to radio frequencies and is
updated when new data arrives.
+A TAP query to @code{ned} is submitted to
@code{https://ned.ipac.caltech.edu/tap/sync}.
-Crop will crop a region from an image.
-If in WCS mode, it will also stitch parts from separate images in the input
files.
-The executable name is @file{astcrop} with the following general template
+@itemize
+@item
+@code{objdir --> NEDTAP.objdir}: default TAP-based dataset in NED.
-@example
-$ astcrop [OPTION...] [ASCIIcatalog] ASTRdata ...
-@end example
+@item
+@cindex VOTable
+@code{extinction}: A command-line interface to the
@url{https://ned.ipac.caltech.edu/extinction_calculator, NED Extinction
Calculator}.
+It only takes a central coordinate and returns a VOTable of the calculated
extinction in many commonly used filters at that point.
+As a result, options like @option{--width} or @option{--radius} are not
supported.
+However, Gnuastro does not yet support the VOTable format.
+Therefore, if you specify an @option{--output} file, it should have an
@file{.xml} suffix and the downloaded file will not be checked.
+Until VOTable support is added to Gnuastro, you can use GREP, AWK and SED to
convert the VOTable data into a FITS table with a command like below (assuming
the queried VOTable is called @file{ned-extinction.xml}):
-@noindent
-One line examples:
-
-@example
-## Crop all objects in cat.txt from image.fits:
-$ astcrop --catalog=cat.txt image.fits
-
-## Crop all objects in catalog (with RA,DEC) from all the files
-## ending in `_drz.fits' in `/mnt/data/COSMOS/':
-$ astcrop --mode=wcs --catalog=cat.txt /mnt/data/COSMOS/*_drz.fits
-
-## Crop the outer 10 border pixels of the input image and give
-## the output HDU a name ('EXTNAME' keyword in FITS) of 'mysection'.
-$ astcrop --section=10:*-10,10:*-10 --hdu=2 image.fits \
- --metaname=mysection
-
-## Crop region around RA and Dec of (189.16704, 62.218203):
-$ astcrop --mode=wcs --center=189.16704,62.218203 goodsnorth.fits
-
-## Same crop above, but coordinates given in sexagesimal (you can
-## also use ':' between the sexagesimal components).
-$ astcrop --mode=wcs --center=12h36m40.08,62d13m5.53 goodsnorth.fits
+@verbatim
+grep '^<TR><TD>' ned-extinction.xml \
+ | sed -e's|<TR><TD>||' \
+ -e's|</TD></TR>||' \
+ -e's|</TD><TD>|@|g' \
+ | awk 'BEGIN{FS="@"; \
+ print "# Column 1: FILTER [name,str15] Filter name"; \
+ print "# Column 2: CENTRAL [um,f32] Central Wavelength"; \
+ print "# Column 3: EXTINCTION [mag,f32] Galactic Ext."; \
+ print "# Column 4: ADS_REF [ref,str50] ADS reference"} \
+ {printf "%-15s %g %g %s\n", $1, $2, $3, $4}' \
+ | asttable -oned-extinction.fits
+@end verbatim
-## Crop region around pixel coordinate (568.342, 2091.719):
-$ astcrop --mode=img --center=568.342,2091.719 --width=201 image.fits
+Once the table is in FITS, you can easily get the extinction for a certain
filter (for example, the @code{SDSS r} filter) like the command below:
-## Crop all HDUs within a FITS file at a certain coordinate, while
-## preserving the names of the HDUs in the output.
-$ for hdu in $(astfits input.fits --listimagehdus); do \
- astcrop input.fits --hdu=$hdu --append --output=crop.fits \
- --metaname=$hdu --mode=wcs --center=189.16704,62.218203 \
- --width=10/3600
- done
+@example
+asttable ned-extinction.fits --equal=FILTER,"SDSS r" \
+ -cEXTINCTION
@end example
+@end itemize
-@noindent
-Crop has one mandatory argument which is the input image name(s), shown above
with @file{ASTRdata ...}.
-You can use shell expansions, for example, @command{*} for this if you have
lots of images in WCS mode.
-If the crop box centers are in a catalog, you can use the @option{--catalog}
option.
-In other cases, the parameters of the single cropped output must be given
with command-line options.
-See @ref{Crop output} for how the output file name(s) can be specified.
-For the full list of general options to all Gnuastro programs (including
Crop), please see @ref{Common options}.
+@item vizier
+@cindex VizieR
+@cindex CDS, VizieR
+@cindex Catalog, Vizier
+@cindex Database, VizieR
+VizieR (@url{https://vizier.u-strasbg.fr}) is arguably the largest catalog
database in astronomy, containing more than 20500 catalogs as of mid-January
2021.
+Almost all published catalogs in major projects, and even the tables in many
papers, are archived and accessible here.
+For example, VizieR also has a full copy of the Gaia database mentioned above,
with some additional standardized columns (like RA and Dec in J2000).
-Floating point numbers can be used to specify the crop region (except the
@option{--section} option, see @ref{Crop section syntax}).
-In such cases, the floating point values will be used to find the desired
integer pixel indices based on the FITS standard.
-Hence, Crop ultimately does not do any sub-pixel cropping (in other words, it
does not change pixel values).
-If you need such crops, you can use @ref{Warp} to first warp the image to a
new pixel grid, then crop from that.
-For example, let's assume you want a crop from pixels 12.982 to 80.982 along
the first dimension.
-You should first translate the image by @mymath{-0.482} (note that the edge of
a pixel is at integer multiples of @mymath{0.5}).
-So you should run Warp with @option{--translate=-0.482,0} and then crop the
warped image with @option{--section=13:81}.
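-A minimal sketch of this two-step process (the output name is hypothetical):
-@example
-$ astwarp image.fits --translate=-0.482,0 --output=shifted.fits
-$ astcrop shifted.fits --mode=img --section=13:81
-@end example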
+The current implementation of @option{--limitinfo} only looks into the
description of the datasets, but since VizieR is so large, there is still a lot
of room for improvement.
+Until then, if @option{--limitinfo} is not sufficient, you can use VizieR's
own web-based search for your desired dataset:
@url{http://cdsarc.u-strasbg.fr/viz-bin/cat}
-There are two ways to define the cropped region: with its center or its
vertices.
-See @ref{Crop modes} for a full description.
-In the former case, Crop can check if the central region of the cropped image
is indeed filled with data or is blank (see @ref{Blank pixels}), and not
produce any output when the center is blank, see the description under
@option{--checkcenter} for more.
-@cindex Asynchronous thread allocation
-When in catalog mode, Crop will run in parallel unless you set
@option{--numthreads=1}, see @ref{Multi-threaded operations}.
-Note that when multiple outputs are created with threads, the outputs will not
be created in the same order.
-This is because the threads are asynchronous and thus not started in order.
-This has no effect on each output, see @ref{Reddest clumps cutouts and
parallelization} for a tutorial on effectively using this feature.
+Because VizieR curates such a diverse set of data from tens of thousands of
projects and aims for interoperability between them, the column names in VizieR
may not be identical to the column names in the surveys' own databases (Gaia in
the example above).
+A query to @code{vizier} is submitted to
@code{http://tapvizier.u-strasbg.fr/TAPVizieR/tap/sync}.
-@menu
-* Crop options:: A list of all the options with explanation.
-* Crop output:: The outputs of Crop.
-* Crop known issues:: Known issues in running Crop.
-@end menu
+@cindex 2MASS All-Sky Catalog
+@cindex AKARI/FIS All-Sky Survey
+@cindex AllWISE Data Release
+@cindex AAVSO Photometric All Sky Survey, DR9
+@cindex CatWISE 2020 catalog
+@cindex Dark Energy Survey data release 1
+@cindex GAIA Data Release (2 or 3)
+@cindex All-sky Survey of GALEX DR5
+@cindex Naval Observatory Merged Astrometric Dataset
+@cindex Pan-STARRS Data Release 1
+@cindex SDSS Photometric Catalogue, Release 12
+@cindex Whole-Sky USNO-B1.0 Catalog
+@cindex U.S. Naval Observatory CCD Astrograph Catalog
+@cindex Band-merged unWISE Catalog
+@cindex WISE All-Sky data Release
+Here is the list of short names for popular datasets within VizieR (sorted
alphabetically by their short name).
+Please feel free to suggest other major catalogs (covering a wide area or
commonly used in your field).
+For details on each dataset, with the necessary citations and links to web
pages, look them up with their VizieR names at
@url{https://vizier.u-strasbg.fr/viz-bin/VizieR}.
+@itemize
+@item
+@code{2mass --> II/246/out} (2MASS All-Sky Catalog)
+@item
+@code{akarifis --> II/298/fis} (AKARI/FIS All-Sky Survey)
+@item
+@code{allwise --> II/328/allwise} (AllWISE Data Release)
+@item
+@code{apass9 --> II/336/apass9} (AAVSO Photometric All Sky Survey, DR9)
+@item
+@code{catwise --> II/365/catwise} (CatWISE 2020 catalog)
+@item
+@code{des1 --> II/357/des_dr1} (Dark Energy Survey data release 1)
+@item
+@code{gaiadr3 --> I/355/gaiadr3} (GAIA Data Release 3)
+@item
+@code{gaiaedr3 --> I/350/gaiaedr3} (GAIA Early Data Release 3)
+@item
+@code{gaiadr2 --> I/345/gaia2} (GAIA Data Release 2)
+@item
+@code{galex5 --> II/312/ais} (All-sky Survey of GALEX DR5)
+@item
+@code{nomad --> I/297/out} (Naval Observatory Merged Astrometric Dataset)
+@item
+@code{panstarrs1 --> II/349/ps1} (Pan-STARRS Data Release 1).
+@item
+@code{ppmxl --> I/317/sample} (Positions and proper motions on the ICRS)
+@item
+@code{sdss12 --> V/147/sdss12} (SDSS Photometric Catalogue, Release 12)
+@item
+@code{usnob1 --> I/284/out} (Whole-Sky USNO-B1.0 Catalog)
+@item
+@code{ucac5 --> I/340/ucac5} (5th U.S. Naval Obs. CCD Astrograph Catalog)
+@item
+@code{unwise --> II/363/unwise} (Band-merged unWISE Catalog)
+@item
+@code{wise --> II/311/wise} (WISE All-Sky data Release)
+@end itemize
+@end table
-@node Crop options, Crop output, Invoking astcrop, Invoking astcrop
-@subsubsection Crop options
-The options can be classified into the following contexts: Input, Output and
operating mode options.
-Options that are common to all Gnuastro program are listed in @ref{Common
options} and will not be repeated here.
-When you are specifying the crop vertices yourself (through
@option{--section}, or @option{--polygon}) on relatively small regions
(depending on the resolution of your images) the outputs from image and WCS
mode can be approximately equivalent.
-However, as the crop sizes get large, the curved nature of the WCS coordinates
has to be considered.
-For example, when using @option{--section}, the right ascension of the bottom
left and top left corners will not be equal.
-If you only want regions within a given right ascension, use
@option{--polygon} in WCS mode.
-@noindent
-Input image parameters:
-@table @option
+@node Invoking astquery, , Available databases, Query
+@subsection Invoking Query
-@item --hstartwcs=INT
-Specify the first keyword card (line number) to start finding the input image
world coordinate system information.
-This is useful when certain header keywords of the input may cause bad
conflicts with your crop (see an example described below).
-To get line numbers of the header keywords, you can pipe the fully printed
header into @command{cat -n} like below:
+Query provides a high-level interface to downloading subsets of data from
databases.
+The executable name is @file{astquery} with the following general template
@example
-$ astfits image.fits -h1 | cat -n
+$ astquery DATABASE-NAME [OPTION...] ...
@end example
-@cindex CANDELS survey
-For example, distortions have only been present in WCSLIB from version 5.15
(released in mid 2016).
-Therefore some pipelines still apply their own specific set of WCS keywords
for distortions and put them into the image header along with those that WCSLIB
does recognize.
-So now that WCSLIB recognizes most of the standard distortion parameters, they
will get confused with the old ones and give wrong results.
-One example is the CANDELS-GOODS South images that were created before WCSLIB
5.15@footnote{@url{https://archive.stsci.edu/pub/hlsp/candels/goods-s/gs-tot/v1.0/}}.
+@noindent
+One line examples:
-The two options @option{--hstartwcs} and @option{--hendwcs} are thus provided
so that, when using older datasets, you can specify which region of the FITS
header you want to use to read the WCS keywords.
-Note that this is only relevant for reading the WCS information, basic data
information like the image size are read separately.
-These two options will only be considered when the value to @option{--hendwcs}
is larger than that of @option{--hstartwcs}.
-So if they are equal or @option{--hstartwcs} is larger than
@option{--hendwcs}, then all the input keywords will be parsed to get the WCS
information of the image.
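-For example, a minimal sketch (the card numbers are hypothetical and must
-first be found from the numbered header output shown above):
-@example
-$ astcrop image.fits --mode=wcs --center=53.16,-27.78 \
-          --width=1/60 --hstartwcs=10 --hendwcs=60
-@end example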
+@example
-@item --hendwcs=INT
-Specify the last keyword card to read for specifying the image world
coordinate system on the input images.
-See @option{--hstartwcs}.
+## Information about all datasets in ESA's GAIA database:
+$ astquery gaia --information
-@end table
+## Only show catalogs in VizieR that have 'MUSE' in their
+## description. The '-i' is short for '--information'.
+$ astquery vizier -i --limitinfo=MUSE
-@noindent
-Crop box parameters:
-@table @option
+## List of columns in 'J/A+A/608/A2/udf10' (one of the above).
+$ astquery vizier --dataset=J/A+A/608/A2/udf10 -i
-@item -c FLT[,FLT[,...]]
-@itemx --center=FLT[,FLT[,...]]
-The central position of the crop in the input image.
-The positions along each dimension must be separated by a comma (@key{,}) and
fractions are also acceptable.
-The comma-separated values can either be in degrees (a single number), or
sexagesimal (@code{_h_m_} for RA, @code{_d_m_} for Dec, or @code{_:_:_} for
both).
+## ID, RA and Dec of all Gaia sources within an image.
+$ astquery gaia --dataset=dr3 --overlapwith=image.fits \
+ -csource_id,ra,dec
-The number of values given to this option must be the same as the dimensions
of the input dataset.
-The width of the crop should be set with @code{--width}.
-The units of the coordinates are read based on the value to the
@option{--mode} option, see below.
+## RA, Dec and Spectroscopic redshifts of objects in SDSS DR12
+## spectroscopic redshift that overlap with 'image.fits'.
+$ astquery vizier --dataset=sdss12 --overlapwith=image.fits \
+ -cRA_ICRS,DE_ICRS,zsp --range=zsp,1e-10,inf
-@item -O STR
-@itemx --mode=STR
-Mode to interpret the crop's coordinates (for example with @option{--center},
@option{--catalog} or @option{--polygon}).
-The value must either be @option{img} (to assume image/pixel coordinates) or
@option{wcs} (to assume WCS, usually RA/Dec, coordinates), see @ref{Crop modes}
for a full description.
+## All columns of all entries in the Gaia DR3 catalog (hosted at
+## VizieR) within 1 arc-minute of the given coordinate.
+$ astquery vizier --dataset=gaiadr3 --output=my-gaia.fits \
+ --center=113.8729761,31.9027152 --radius=1/60
-@item -w FLT[,FLT[,...]]
-@itemx --width=FLT[,FLT[,...]]
-Width of the cropped region about coordinate given to @option{--center}.
-If in WCS mode, value(s) given to this option will be read in the same units
as the dataset's WCS information along this dimension (unless
@option{--widthinpix} is given).
-This option may take either a single value (to be used for all dimensions:
@option{--width=10} in image-mode will crop a @mymath{10\times10} pixel image)
or multiple values (a specific value for each dimension: @option{--width=10,20}
in image-mode will crop a @mymath{10\times20} pixel image).
+## Similar to above, but only ID, RA and Dec columns for objects with
+## magnitude range 10 to 15. In VizieR, this column is called 'Gmag'.
+## Also, using sexagesimal coordinates instead of degrees for center.
+$ astquery vizier --dataset=gaiadr3 --output=my-gaia.fits \
+ --center=07h35m29.51,31d54m9.77 --radius=1/60 \
+ --range=Gmag,10:15 -cDR3Name,RAJ2000,DEJ2000
+@end example
-The @code{--width} option also accepts fractions.
-For example, if you want the width of your crop to be 3 by 5 arcseconds along
RA and Dec respectively and you are in wcs-mode, you can use:
@option{--width=3/3600,5/3600}.
+Query takes a single argument: the name of the database.
+For the full list of available databases and how to access them, see @ref{Available databases}.
+There are two methods to query a database; each is discussed more fully in its option's description below.
+@itemize
+@item
+@strong{Low-level:}
+With @option{--query} you can directly give a raw query statement that is
recognized by the database.
+This is very low level and will require a good knowledge of the database's
query language, but of course, it is much more powerful.
+If this option is given, the raw string is directly passed to the server and
all other constraints/options (for Query's high-level interface) are ignored.
+@item
+@strong{High-level:}
+With the high-level options (like @option{--column}, @option{--center},
@option{--radius}, @option{--range} and other constraining options below), the
low-level query will be constructed automatically for the particular database.
+This method is only limited to the generic capabilities that Query provides
for all servers.
+So @option{--query} is more powerful; however, with the high-level options you do not need any knowledge of the database's query language.
+You can see the internally generated query on the terminal (if @option{--quiet} is not used) or in the 0-th extension of the output (if it is a FITS file): the full command stored there contains the internally generated query.
+@end itemize
-The final output will have an odd number of pixels to allow easy
identification of the pixel which keeps your requested coordinate (from
@option{--center} or @option{--catalog}).
-If you want an even sided crop, you can run Crop afterwards with
@option{--section=":*-1,:*-1"} or @option{--section=2:,2:} (depending on which
side you do not need), see @ref{Crop section syntax}.
+The name of the downloaded output file can be set with @option{--output}.
+The requested output format can have any of the @ref{Recognized table formats}
(currently @file{.txt} or @file{.fits}).
+Like all Gnuastro programs, if the output is a FITS file, the zero-th/first
HDU of the output will contain all the command-line options given to Query as
well as the full command used to access the server.
+When @option{--output} is not set, the output name will be in the format of
@file{NAME-STRING.fits}, where @file{NAME} is the name of the database and
@file{STRING} is a randomly selected 6-character set of numbers and alphabetic
characters.
+With this feature, a second run of @command{astquery} that is not called with @option{--output} will not overwrite an already downloaded file.
+Generally, when calling Query more than once, it is recommended to set an
output name for each call based on your project's context.
-The basic reason for making an odd-sided crop is that your given central
coordinate will ultimately fall within a discrete pixel in the image (defined
by the FITS standard).
-When the crop has an odd number of pixels in each dimension, that pixel can be
very well defined as the ``central'' pixel of the crop, making it unambiguously
easy to identify.
-However, for an even-sided crop, it will be very hard to identify the central
pixel (it can be on any of the four pixels adjacent to the central point of the
image!).
+The outputs of Query will have a common output format, irrespective of the database used.
+To achieve this, Query will ask the databases to provide a FITS table output
(for larger tables, FITS can consume much less download volume).
+After downloading is complete, the raw downloaded file will be read into
memory once by Query, and written into the file given to @option{--output}.
+The raw downloaded file will be deleted by default, but can be preserved with
the @option{--keeprawdownload} option.
+This strategy avoids unnecessary surprises depending on the database.
+For example, some databases can download a compressed FITS table, even though
we ask for FITS.
+But with the strategy above, the final output will be an uncompressed FITS
file.
+The metadata that is added by Query (including the full download command) is
also very useful for future usage of the downloaded data.
+Unfortunately many databases do not write the input queries into their
generated tables.
-@item -X
-@itemx --widthinpix
-In WCS mode, interpret the value to @option{--width} as number of pixels, not
the WCS units like degrees.
-This is useful when you want a fixed crop size in pixels, even though your
center coordinates are in WCS (for example, RA and Dec).
+@table @option
-@item -l STR
-@itemx -l FLT:FLT,...
-@itemx --polygon=STR
-@itemx --polygon=FLT,FLT:FLT,FLT:...
-@cindex Sexagesimal
-Polygon vertex coordinates (when the value is in @option{FLT,FLT:FLT,FLT:...} format) or the file name of a SAO DS9 region file (when the value has no @file{,} or @file{:} characters).
-Each vertex can either be in degrees (a single floating point number) or sexagesimal (in formats of `@code{_h_m_}' for RA and `@code{_d_m_}' for Dec, or simply `@code{_:_:_}' for either of them).
+@item --dry-run
+Only print the final download command to contact the server; do not actually run it.
+This option is good when you want to check the finally constructed query or the download options given to the download program.
+You may also want to use the constructed command as a base for further customization before running it yourself.
-The vertices are used to define the polygon: in the same order given to this
option.
-When the vertices are not given in the proper order (for example, one vertex of a square comes after its diagonal opposite), you can add the @option{--polygonsort} option which will attempt to sort the vertices before cropping.
-Note that for concave polygons, sorting is not recommended because there is no
unique solution, for more, see the description under @option{--polygonsort}.
+@item -k
+@itemx --keeprawdownload
+Do not delete the raw downloaded file from the database.
+The name of the raw download will have the format @file{OUTPUT-raw-download.fits}, where @file{OUTPUT} is the base name of the final output file (without its suffix).
-This option can be used both in the image and WCS modes, see @ref{Crop modes}.
-If a SAO DS9 region file is used, the coordinate mode of Crop will be
determined by the contents of the file and any value given to @code{--mode} is
ignored.
-The cropped image will be the size of the rectangular region that completely
encompasses the polygon.
-By default all the pixels that are outside of the polygon will be set as blank
values (see @ref{Blank pixels}).
-However, if @option{--polygonout} is called all pixels internal to the
vertices will be set to blank.
-In WCS-mode, you may provide many FITS images/tiles: Crop will stitch them to
produce this cropped region, then apply the polygon.
+@item -i
+@itemx --information
+Print the information of all datasets (tables) within a database, or of all columns within a dataset.
+When @option{--dataset} is specified, the latter mode (all column information) is downloaded and printed; when it is not defined, all dataset information (within the database) is printed.
-The syntax for the polygon vertices is similar to, and simpler than, that for
@option{--section}.
-In short, the dimensions of each coordinate are separated by a comma (@key{,})
and each vertex is separated by a colon (@key{:}).
-You can define as many vertices as you like.
-If you would like to use space characters between the dimensions and vertices
to make them more human-readable, then you have to put the value to this option
in double quotation marks.
+Some databases (like VizieR) contain tens of thousands of datasets, so you can limit the downloaded and printed information with the @option{--limitinfo} option (described below).
+Dataset descriptions are often large and contain a lot of text (unlike column descriptions).
+Therefore, when printing the information of all datasets within a database, the information (e.g., the dataset name) will be printed on separate lines before the description.
+However, when printing column information, the output has the same format as a
similar option in Table (see @ref{Invoking asttable}).
-For example, let's assume you want to work on the deepest part of the WFC3/IR
images of Hubble Space Telescope eXtreme Deep Field (HST-XDF).
-@url{https://archive.stsci.edu/prepds/xdf/, According to the web
page}@footnote{@url{https://archive.stsci.edu/prepds/xdf/}} the deepest part is
contained within the coordinates:
+Important note to consider: the printed order of the datasets or columns is just for displaying in the printed output.
+You cannot ask for datasets or columns based on the printed order; you need to use dataset or column names.
-@example
-[ (53.187414,-27.779152), (53.159507,-27.759633),
- (53.134517,-27.787144), (53.161906,-27.807208) ]
-@end example
+@item -L STR
+@itemx --limitinfo=STR
+Limit the information that is downloaded and displayed (with @option{--information}) to the datasets that have the string given to this option in their description.
+Note that @emph{this is case-sensitive}.
+This option is only relevant when @option{--information} is also called.
-They have provided mask images with only these pixels in the WFC3/IR images,
but what if you also need to work on the same region in the full resolution ACS
images? Also what if you want to use the CANDELS data for the shallow region?
Running Crop with @option{--polygon} will easily pull out this region of the
image for you, irrespective of the resolution.
-If you have set the operating mode to WCS mode in your nearest configuration
file (see @ref{Configuration files}), there is no need to call
@option{--mode=wcs} on the command-line.
+Databases may have thousands (or tens of thousands) of datasets.
+Therefore, even the metadata (information) shown with @option{--information} can be tens of megabytes (for example, the full VizieR metadata file was about 23MB as of January 2021).
+Once downloaded, it can also be hard to parse manually.
+With @option{--limitinfo}, only the metadata of datasets that contain this
string @emph{in their description} will be downloaded and displayed, greatly
improving the speed of finding your desired dataset.
-@example
-$ astcrop --mode=wcs desired-filter-image(s).fits \
- --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
- 53.134517,-27.787144 : 53.161906,-27.807208"
-@end example
+@item -Q "STR"
+@itemx --query="STR"
+Directly specify the query to be passed onto the database.
+The queries will generally contain space and other meta-characters, so we
recommend placing the query within quotations.
-@cindex SAO DS9 region file
-@cindex Region file (SAO DS9)
-More generally, you have an image and want to define the polygon yourself (it
is not already published like the example above).
-As the number of vertices increases, checking the vertex coordinates on a FITS
viewer (for example, SAO DS9) and typing them in, one by one, can be very
tedious and prone to typo errors.
-In such cases, you can make a polygon ``region'' in DS9 and, using your mouse, easily define (and visually see) it.
-Given that SAO DS9 has a graphic user interface (GUI), if you do not have the polygon vertices before-hand, it is much easier to build your polygon there and pass it onto Crop through the region file.
+@item -s STR
+@itemx --dataset=STR
+The dataset to query within the database (not compatible with
@option{--query}).
+This option is mandatory when neither @option{--query} nor @option{--information} is provided.
+You can see the list of available datasets within a database using
@option{--information} (possibly supplemented by @option{--limitinfo}).
+The output of @option{--information} will contain the recognized name of the
datasets within that database.
+You can pass the recognized name directly to this option.
+For more on finding and using your desired database, see @ref{Available
databases}.
-You can take the following steps to make an SAO DS9 region file containing
your polygon.
-Open your desired FITS image with SAO DS9 and activate its ``region'' mode
with @clicksequence{Edit@click{}Region}.
-Then define the region as a polygon with
@clicksequence{Region@click{}Shape@click{}Polygon}.
-Click on the approximate center of the region you want and a small square will
appear.
-By clicking on the vertices of the square you can shrink or expand it; clicking and dragging anywhere on the edges will enable you to define a new vertex.
-After the region has been nicely defined, save it as a file with
@clicksequence{Region@click{}``Save Regions''}.
-You can then select the name and address of the output file, keep the format
as @command{REG (*.reg)} and press the ``OK'' button.
-In the next window, keep format as ``ds9'' and ``Coordinate System'' as
``fk5'' for RA and Dec (or ``Image'' for pixel coordinates).
-A plain text file is now created (let's call it @file{ds9.reg}) which you can
pass onto Crop with @command{--polygon=ds9.reg}.
+@item -c STR
+@itemx --column=STR[,STR[,...]]
+The column name(s) to retrieve from the dataset in the given order (not
compatible with @option{--query}).
+If not given, all the dataset's columns for the selected rows will be queried
(which can be large!).
+This option can take multiple values in one instance (for example,
@option{--column=ra,dec,mag}), or in multiple instances (for example,
@option{-cra -cdec -cmag}), or mixed (for example, @option{-cra,dec -cmag}).
-For the expected format of the region file, see the description of
@code{gal_ds9_reg_read_polygon} in @ref{SAO DS9 library}.
-However, since SAO DS9 makes this file for you, you do not usually need to worry about its internal format unless something unexpected happens and you find a bug.
+In case you do not know the full list of the dataset's column names a priori, and you do not want to download all the columns (which can greatly decrease your download speed), you can use the @option{--information} option combined with the @option{--dataset} option, see @ref{Available databases}.
-@item --polygonout
-Keep all the regions outside the polygon and mask the inner ones with blank
pixels (see @ref{Blank pixels}).
-This is practically the inverse of the default mode of treating polygons.
-Note that this option only works when you have only provided one input image.
-If multiple images are given (in WCS mode), then the full area covered by all
the images has to be shown and the polygon excluded.
-This can lead to a very large area if large surveys like COSMOS are used.
-So Crop will abort and notify you.
-In such cases, it is best to crop out the larger region you want, then mask
the smaller region with this option.
+@item -H INT
+@itemx --head=INT
+Only ask for the first @code{INT} rows of the finally selected columns, not
all the rows.
+This can be good when your search may result in a large dataset: before downloading the full volume, you can inspect the top rows and get a feeling of what the whole dataset looks like.
-@item --polygonsort
-Sort the given set of vertices to the @option{--polygon} option.
-For a convex polygon it will sort the vertices correctly; however, for a concave polygon there is no unique sorting, so be careful because the crop may not be what you expected.
+@item -v FITS
+@itemx --overlapwith=FITS
+File name of a FITS file containing an image (in the HDU given by @option{--hdu}) to use for identifying the region to query in the given database and dataset.
+Based on the image's WCS and pixel size, the sky coverage of the image is estimated and values for the @option{--center} and @option{--width} options will be calculated internally.
+Hence this option cannot be used with @code{--center}, @code{--width} or
@code{--radius}.
+Also, since it internally generates the query, it cannot be used with
@code{--query}.
-@cindex Convex polygons
-@cindex Concave polygons
-@cindex Polygons, Convex
-@cindex Polygons, Concave
-Polygons come in two classes: convex and concave (or generally, non-convex!),
see below for a demonstration.
-Convex polygons are those where all inner angles are less than 180 degrees.
-By contrast, a concave polygon is one where an inner angle may be more than
180 degrees.
+Note that if the image has WCS distortions and the reference point for the WCS is not within the image, the WCS will not be well-defined.
+Therefore the resulting catalog may not overlap with the image, or may correspond to a larger/smaller area of the sky.
-@example
-     Concave Polygon              Convex Polygon
+@item -C FLT,FLT
+@itemx --center=FLT,FLT
+The spatial center position (mostly RA and Dec) to use for the automatically
generated query (not compatible with @option{--query}).
+The comma-separated values can either be in degrees (a single number), or
sexagesimal (@code{_h_m_} for RA, @code{_d_m_} for Dec, or @code{_:_:_} for
both).
- D --------C                   D------------- C
-  \        |                 E /              |
-   \E      |                   \              |
-   /       |                    \             |
-  A--------B                     A ----------B
-@end example
+The given values will be compared to two columns in the database; rows within a certain region around this center position will be requested and downloaded.
+Query has pre-defined RA and Dec column names for every database; however, you can use @option{--ccol} to select other columns to use instead.
+The region can either be a circle around the point (configured with @option{--radius}) or a box/rectangle around the point (configured with @option{--width}).
-@item -s STR
-@itemx --section=STR
-Section of the input image which you want to be cropped.
-See @ref{Crop section syntax} for a complete explanation on the syntax
required for this input.
+@item --ccol=STR,STR
+The name of the coordinate-columns in the dataset to compare with the values
given to @option{--center}.
+If this option is not given, Query will use its internal defaults for each dataset (for example, @code{RAJ2000} and @code{DEJ2000} for VizieR data).
+But each dataset is treated separately, and it is not guaranteed that these columns exist in all datasets.
+Also, more than one coordinate system/epoch may be present in a dataset; you can use this option to construct your spatial constraint based on the other coordinate systems/epochs.
-@item -C FITS/TXT
-@itemx --catalog=FITS/TXT
-File name of catalog for making multiple crops from the input images/cubes.
-The catalog can be in any of Gnuastro's recognized @ref{Recognized table
formats}.
-The columns containing the coordinates for the crop centers can be specified
with the @option{--coordcol} option (using column names or numbers, see
@ref{Selecting table columns}).
-The catalog can also contain the name of each crop, you can specify the column
containing the name with the @option{--namecol}.
+@item -r FLT
+@itemx --radius=FLT
+The radius about the requested center to use for the automatically generated
query (not compatible with @option{--query}).
+The radius is in units of degrees, but you can use simple division with this
option directly on the command-line.
+For example, if you want a radius of 20 arc-minutes or 20 arc-seconds, you can
use @option{--radius=20/60} or @option{--radius=20/3600} respectively (which is
much more human-friendly than @code{0.3333} or @code{0.005556}).
-@item --cathdu=STR/INT
-The HDU (extension) containing the catalog (if the file given to
@option{--catalog} is a FITS file).
-This can either be the HDU name (if it has one) or number (counting from 0).
-By default (if this option is not given), the second HDU will be used (equivalent to @option{--cathdu=1}).
-For more on how to specify the HDU, see the explanation of the @option{--hdu}
option in @ref{Input output options}.
+@item -w FLT[,FLT]
+@itemx --width=FLT[,FLT]
+The square (or rectangle) side length (width) about the requested center to
use for the automatically generated query (not compatible with
@option{--query}).
+If only one value is given to @code{--width}, the region will be a square; if two values are given, the widths of the query box along each dimension will be different.
+The value(s) are in the same units as the coordinate columns (see @option{--ccol}; usually RA and Dec, which are in degrees).
+You can use simple division for each value directly on the command-line if you want relatively small (and more human-friendly) sizes.
+For example, if you want your box to be 1 arc-minute along RA and 2 arc-minutes along Dec, you can use @option{--width=1/60,2/60}.
-@item -x STR/INT
-@itemx --coordcol=STR/INT
-The column in a catalog to read as a coordinate.
-The value can be either the column number (starting from 1), or a match/search
in the table meta-data, see @ref{Selecting table columns}.
-This option must be called multiple times, depending on the number of
dimensions in the input dataset.
-If it is called more than necessary, the extra columns (later calls to this
option on the command-line or configuration files) will be ignored, see
@ref{Configuration file precedence}.
+@item -g STR,FLT,FLT
+@itemx --range=STR,FLT,FLT
+The column name and numerical range (inclusive) of acceptable values in that
column (not compatible with @option{--query}).
+This option can be called multiple times for applying range limits on many
columns in one call (thus greatly reducing the download size).
+For example, when used on the ESA Gaia database, you can use @code{--range=phot_g_mean_mag,10:15} to only get rows that have a value between 10 and 15 (inclusive on both sides) in the @code{phot_g_mean_mag} column.
-@item -n STR/INT
-@itemx --namecol=STR/INT
-Column selection of crop file name.
-The value can be either the column number (starting from 1), or a match/search in the table meta-data, see @ref{Selecting table columns}.
-This option can be used both in Image and WCS modes, and is not mandatory.
-When a column is given to this option, the final crop base file name will be
taken from the contents of this column.
-The directory will be determined by the @option{--output} option (current
directory if not given) and the value to @option{--suffix} will be appended.
-When this column is not given, the row number will be used instead.
+If you want all rows larger or smaller than a certain number, you can use @code{inf} or @code{-inf} as the second or first value respectively.
+For example, if you want objects with SDSS spectroscopic redshifts larger than 2 (from the VizieR @code{sdss12} database), you can use @option{--range=zsp,2,inf}.
+
+If you want the interval to not be inclusive on both sides, you can run
@code{astquery} once and get the command that it executes.
+Then you can edit it to be non-inclusive on your desired side.
+
+@item -b STR[,STR]
+@itemx --noblank=STR[,STR]
+Only ask for rows that do not have a blank value in the @code{STR} column.
+This option can be called many times, and each call can have multiple column
names (separated by a comma or @key{,}).
+For example, if you want the retrieved rows to not have a blank value in
columns @code{A}, @code{B}, @code{C} and @code{D}, you can use
@command{--noblank=A -bB,C,D}.
+@item --sort=STR[,STR]
+Ask the server to sort the downloaded data based on the given columns.
+For example, let's assume your desired catalog has column @code{Z} for
redshift and column @code{MAG_R} for magnitude in the R band.
+When you call @option{--sort=Z,MAG_R}, it will primarily sort the columns
based on the redshift, but if two objects have the same redshift, they will be
sorted by magnitude.
+You can add as many columns as you like for higher-level sorting.
@end table
-@node Crop output, Crop known issues, Crop options, Invoking astcrop
-@subsubsection Crop output
-The string given to @option{--output} option will be interpreted depending on
how many crops were requested, see @ref{Crop modes}:
-@itemize
-@item
-When a catalog is given, the value of the @option{--output} (see @ref{Common
options}) will be read as the directory to store the output cropped images.
-Hence if it does not already exist, Crop will abort with a ``No such file or directory'' error.
-The crop file names will consist of two parts: a variable part (the row number
of each target starting from 1) along with a fixed string which you can set
with the @option{--suffix} option.
-Optionally, you may also use the @option{--namecol} option to define a column
in the input catalog to use as the file name instead of numbers.
-@item
-When only one crop is desired, the value to @option{--output} will be read as
a file name.
-If no output is specified or if it is a directory, the output file name will
follow the automatic output names of Gnuastro, see @ref{Automatic output}: The
string given to @option{--suffix} will be replaced with the @file{.fits} suffix
of the input.
-@end itemize
-By default, as suggested by the FITS standard and implemented in all Gnuastro
programs, the first/primary extension of the output files will only contain
metadata.
-The cropped images/cubes will be written into the 2nd HDU of their respective
FITS file (which is actually counted as @code{1} because HDU counting starts
from @code{0}).
-However, if you want the cropped data to be written into the primary (0-th)
HDU, run Crop with the @option{--primaryimghdu} option.
-If the output file already exists, by default Crop will re-write it (so that all existing HDUs in it will be deleted).
-If you want the cropped HDU to be appended to existing HDUs, use
@option{--append} described below.
-The header of each output cropped image will contain the names of the input
image(s) it was cut from.
-If a name is longer than the 70 character space that the FITS standard allows
for header keyword values, the name will be cut into several keywords from the
nearest slash (@key{/}).
-The keywords have the following format: @command{ICFn_m} (for Crop File), where @command{n} is the number of the image used in this crop and @command{m} is the part of the name (it can be broken into multiple keywords).
-Following the name is another keyword named @command{ICFnPIX} which shows the
pixel range from that input image in the same syntax as @ref{Crop section
syntax}.
-So this string can be directly given to the @option{--section} option later.
-Once done, a log file can be created in the current directory with the
@code{--log} option.
-This file will have three columns and the same number of rows as the number of
cropped images.
-There are also comments on the top of the log file explaining basic
information about the run and descriptions for the columns.
-A short description of the columns is also given below:
-@enumerate
-@item
-The cropped image file name for that row.
-@item
-The number of input images that were used to create that image.
-@item
-A @code{0} if the central few pixels (value to the @option{--checkcenter}
option) are blank and @code{1} if they are not.
-When the crop was not defined by its center (see @ref{Crop modes}), or
@option{--checkcenter} was given a value of 0 (see @ref{Invoking astcrop}), the
center will not be checked and this column will be given a value of @code{-1}.
-@end enumerate
-
-If the output crop(s) have a single element (pixel in an image) and
@option{--oneelemstdout} has been called, no output file will be produced!
-Instead, the single element's value is printed on the standard output.
-See the description of @option{--oneelemstdout} below for more:
-
-@table @option
-@item -p STR
-@itemx --suffix=STR
-The suffix (or post-fix) of the output files for when you want all the cropped
images to have a special ending.
-One case where this might be helpful is when besides the science images, you
want the weight images (or exposure maps, which are also distributed with
survey images) of the cropped regions too.
-So in one run, you can set the input images to the science images and
@option{--suffix=_s.fits}.
-In the next run you can set the weight images as input and
@option{--suffix=_w.fits}.
-
-@item -a STR
-@itemx --metaname=STR
-Name of cropped HDU (value to the @code{EXTNAME} keyword of FITS).
-If not given, a default @code{CROP} will be placed there (so the
@code{EXTNAME} keyword will always be present in the output).
-If crop produces many outputs from a catalog, they will be given the same
string as @code{EXTNAME} (the file names containing the cropped HDU will be
different).
-
-@item -A
-@itemx --append
-If the output file already exists, append the cropped image HDU to the end of
any existing HDUs.
-By default (when this option isn't given), if an output file already exists,
any existing HDU in it will be deleted.
-If the output file doesn't exist, this option is redundant.
-@item --primaryimghdu
-Write the output into the primary (0-th) HDU/extension of the output.
-By default, like all Gnuastro's default outputs, no data is written in the
primary extension because the FITS standard suggests keeping that extension
free of data and only for metadata.
-@item -t
-@itemx --oneelemstdout
-When a crop only has a single element (a single pixel), print it to the
standard output instead of making a file.
-By default (without this option), a single-pixel crop will be saved to a file,
just like a crop of any other size.
-When a single crop is requested (either through @option{--center}, or a
catalog of one row is given), the single value alone is printed with nothing
else.
-This makes it easy to immediately write the value into a shell variable for
example:
-@example
-value=$(astcrop img.fits --mode=wcs --center=1.234,5.678 \
- --width=1 --widthinpix --oneelemstdout \
- --quiet)
-@end example
-If a catalog of coordinates is given (that would produce multiple crops; or
multiple values in this scenario), the solution for a single value will not
work!
-Recall that Crop will do the crops in parallel, therefore each time you run
it, the order of the rows will be different and not correspond to the order of
the inputs.
-To allow identification of each value (which row of the input catalog it corresponds to), Crop will first print the name of the would-be created file, and print the value after it (separated by a SPACE character).
-In other words, the file in the first column will not actually be created, but
the value of the pixel it would have contained (if this option was not called)
is printed after it.
-@item -c FLT/INT
-@itemx --checkcenter=FLT/INT
-@cindex Check center of crop
-Square box width of region in the center of the image to check for blank
values.
-If any of the pixels in this central region of a crop (defined by its center)
are blank, then it will not be stored in an output file.
-If the value to this option is zero, no checking is done.
-This check is only applied when the cropped region(s) are defined by their
center (not by the vertices, see @ref{Crop modes}).
+@node Data manipulation, Data analysis, Data containers, Top
+@chapter Data manipulation
-The units of the value are interpreted based on the @option{--mode} value (in
WCS or pixel units).
-The ultimate checked region size (in pixels) will be an odd integer around the
center (converted from WCS, or when an even number of pixels are given to this
option).
-In WCS mode, the value can be given as fractions, for example, if the WCS
units are in degrees, @code{0.1/3600} will correspond to a check size of 0.1
arcseconds.
+Images are one of the major formats of data used in astronomy.
+This chapter explains the programs of the GNU Astronomy Utilities that are provided for manipulating them: for example, cropping out a part of a larger image, convolving an image with a given kernel, or applying a transformation to it.
-Because survey regions do not often have a clean square or rectangle shape,
some of the pixels on the sides of the survey FITS image do not commonly have
any data and are blank (see @ref{Blank pixels}).
-So when the catalog was not generated from the input image, it often happens
that the image does not have data over some of the points.
+@menu
+* Crop:: Crop region(s) from a dataset.
+* Arithmetic:: Arithmetic on input data.
+* Convolve:: Convolve an image with a kernel.
+* Warp:: Warp/Transform an image to a different grid.
+@end menu
-When the given center of a crop falls in such regions or outside the dataset,
and this option has a non-zero value, no crop will be created.
-Therefore with this option, you can specify a width of a small box (3 pixels
is often good enough) around the central pixel of the cropped image.
-You can check which crops were created and which were not from the
command-line (if @option{--quiet} was not called, see @ref{Operating mode
options}), or in Crop's log file (see @ref{Crop output}).
+@node Crop, Arithmetic, Data manipulation, Data manipulation
+@section Crop
-@item -b
-@itemx --noblank
-Pixels outside of the input image that are in the crop box will not be used.
-By default they are filled with blank values (depending on type), see
@ref{Blank pixels}.
-This option only applies in Image mode, see @ref{Crop modes}.
+@cindex Section of an image
+@cindex Crop part of image
+@cindex Postage stamp images
+@cindex Large astronomical images
+@pindex @r{Crop (}astcrop@r{)}
+Astronomical images are often very large, filled with thousands of galaxies.
+It often happens that you only want a section of the image, or you have a
catalog of sources and you want to visually analyze them in small postage
stamps.
+Crop is made to do all these things.
+When more than one crop is required, Crop will divide the crops between
multiple threads to significantly reduce the run time.
-@item -z
-@itemx --zeroisnotblank
-In float or double images, it is common to give the value of zero to blank
pixels.
-If the input image type is one of these two types, such pixels will also be
considered as blank.
-You can disable this behavior with this option, see @ref{Blank pixels}.
-@end table
+@cindex Mosaicing
+@cindex Image tiles
+@cindex Image mosaic
+@cindex COSMOS survey
+@cindex Imaging surveys
+@cindex Hubble Space Telescope (HST)
+Astronomical surveys are usually extremely large.
+So large, in fact, that the whole survey will not fit into a reasonably sized file.
+Because of this, surveys usually cut the final image into separate tiles and store each tile in a file.
+For example, the COSMOS survey's Hubble Space Telescope ACS F814W image consists of 81 separate FITS images, each with a volume of 1.7 Gigabytes.
+@cindex Stitch multiple images
+Even though the tile sizes are chosen to be large enough that too many
galaxies/targets do not fall on the edges of the tiles, inevitably some do.
+So when you simply crop the image of such targets from one tile, you will miss
a large area of the surrounding sky (which is essential in estimating the
noise).
+Therefore in its WCS mode, Crop will stitch parts of the tiles that are
relevant for a target (with the given width) from all the input images that
cover that region into the output.
+Of course, the tiles have to be present in the list of input files.
-@node Crop known issues, , Crop output, Invoking astcrop
-@subsubsection Crop known issues
+Besides cropping postage stamps around certain coordinates, Crop can also crop
arbitrary polygons from an image (or a set of tiles by stitching the relevant
parts of different tiles within the polygon), see @option{--polygon} in
@ref{Invoking astcrop}.
+Alternatively, it can crop out rectangular regions through the
@option{--section} option from one image, see @ref{Crop section syntax}.
-When running Crop, you may encounter strange errors and bugs.
-In these cases, please report a bug and we will try to fix it as soon as
possible, see @ref{Report a bug}.
-However, some things are beyond our control, or may take too long to fix
directly.
-In this section we list such known issues that may occur in known cases and
suggest the hack (or work-around) to fix the problem:
+@menu
+* Crop modes:: Basic modes to define crop region.
+* Crop section syntax:: How to define a section to crop.
+* Blank pixels:: Pixels with no value.
+* Invoking astcrop:: Calling Crop on the command-line
+@end menu
-@table @asis
-@item Crash with @samp{Killed} when cropping catalog from @file{.fits.gz}
-This happens because CFITSIO (the library that reads and writes FITS files) will internally decompress the file in a temporary place (possibly in the RAM), then start reading from it.
-On the other hand, by default when given a catalog (with many crops) and not
specifying @option{--numthreads}, Crop will use the maximum number of threads
available on your system to do each crop faster.
-On a normal (not compressed) file, parallel access will not cause a problem; however, when attempting parallel access with the maximum number of threads on a compressed file, CFITSIO crashes with @code{Killed}.
-Therefore the following solutions can be used to fix this crash:
+@node Crop modes, Crop section syntax, Crop, Crop
+@subsection Crop modes
+In order to be comprehensive, intuitive, and easy to use, there are two ways
to define the crop:
-@itemize
+@enumerate
@item
-Decrease the number of threads (at the minimum, set @option{--numthreads=1}).
-Since this solution does not change any of your previous Crop command components or your local file structure, it is the preferred way.
+From its center and side length.
+For example, if you already know the coordinates of an object and want to
inspect it in an image or to generate postage stamps of a catalog containing
many such coordinates.
@item
-Decompress the file (with the command below) and feed the @file{.fits} file
into Crop without changing the number of threads.
+The vertices of the crop region.
+This can be useful for larger crops over many targets; for example, to crop out a uniformly deep, or contiguous, region of a large survey.
+@end enumerate
-@example
-$ gunzip -k image.fits.gz
-@end example
-@end itemize
-@end table
+Irrespective of how the crop region is defined, the coordinates to define the
crop can be in Image (pixel) or World Coordinate System (WCS) standards.
+All coordinates are read as floating point numbers (not integers, except for
the @option{--section} option, see below).
+By setting the @emph{mode} in Crop, you define the standard in which the given coordinates must be interpreted.
+Here, the different ways to specify the crop region are discussed within each standard.
+For the full list of options, please see @ref{Invoking astcrop}.
+When the crop is defined by its center, the respective (integer) central pixel position will be found internally according to the FITS standard.
+To have this pixel positioned in the center of the cropped region, the final cropped region will have an odd number of pixels (even if you give an even number to @option{--width} in image mode).
+Furthermore, when the crop is defined by its center, Crop allows you to only keep crops that do not have any blank pixels in the vicinity of their center (your primary target).
+This can be very convenient when your input catalog/coordinates originated from another survey/filter which is not fully covered by your input image.
+To learn more about this feature, please see the description of the @option{--checkcenter} option in @ref{Invoking astcrop}.
+@table @asis
+@item Image coordinates
+In image mode (@option{--mode=img}), Crop interprets the pixel coordinates and
widths in units of the input data-elements (for example, pixels in an image,
not world coordinates).
+In image mode, only one image may be input.
+The output crop(s) can be defined in multiple ways as listed below.
+@table @asis
+@item Center of multiple crops (in a catalog)
+The centers of (possibly multiple) crops are read from a text file.
+In this mode, the columns identified with the @option{--coordcol} option are
interpreted as the center of a crop with a width of @option{--width} pixels
along each dimension.
+The columns can contain any floating point value.
+The value given to the @option{--output} option is seen as a directory which will host the (possibly multiple) separate crop files, see @ref{Crop output} for more.
+For a tutorial using this feature, please see @ref{Reddest clumps cutouts and
parallelization}.
+@item Center of a single crop (on the command-line)
+The center of the crop is given on the command-line with the @option{--center}
option.
+The crop width is specified by the @option{--width} option along each
dimension.
+The given coordinates and width can be any floating point number.
+@item Vertices of a single crop
+In Image mode there are two options to define the vertices of a region to
crop: @option{--section} and @option{--polygon}.
+The former is lower-level (it does not accept floating point vertices, and only a rectangular region can be defined); it is also only available in Image mode.
+The latter option (@option{--polygon}) is a higher-level method to define any
polygon (with any number of vertices) with floating point values.
+Please see the description of this option in @ref{Invoking astcrop} for its
syntax.
+@end table
+@item WCS coordinates
+In WCS mode (@option{--mode=wcs}), the coordinates and width are interpreted
using the World Coordinate System (WCS, that must accompany the dataset), not
pixel coordinates.
+You can optionally use @option{--widthinpix} for the width to be interpreted
in pixels (even though the coordinates are in WCS).
+In WCS mode, Crop accepts multiple datasets as input.
+When the cropped region (defined by its center or vertices) overlaps with
multiple of the input images/tiles, the overlapping regions will be taken from
the respective input (they will be stitched when necessary for each output
crop).
+In this mode, the input images do not necessarily have to be the same size; they just need to have the same orientation and pixel resolution.
+Currently only orientation along the celestial coordinates is accepted; if your input has a different orientation or resolution, you can use Warp's @option{--gridfile} option to align the image before cropping it (see @ref{Warp}).
+Each individual input image/tile can even be smaller than the final crop.
+In any case, any part of any of the input images which overlaps with the
desired region will be used in the crop.
+Note that if there is an overlap in the input images/tiles, the pixels from the last input image read will be used for the overlap.
+Crop will not change pixel values, so it assumes your overlapping tiles were cut out from the same original image.
+There are multiple ways to define your cropped region as listed below.
+@table @asis
+@item Center of multiple crops (in a catalog)
+Similar to catalog inputs in Image mode (above), except that the values along
each dimension are assumed to have the same units as the dataset's WCS
information.
+For example, the central RA and Dec value for each crop will be read from the
first and second calls to the @option{--coordcol} option.
+The width of the cropped box (in units of the WCS, or degrees in RA and Dec
mode) must be specified with the @option{--width} option.
+You can optionally use @option{--widthinpix} for the value of @option{--width}
to be interpreted in pixels.
+@item Center of a single crop (on the command-line)
+You can specify the center of only one crop box with the @option{--center} option.
+If it exists in the input images, it will be cropped similarly to the catalog mode; see above also for @option{--width}.
-@node Arithmetic, Convolve, Crop, Data manipulation
-@section Arithmetic
+@item Vertices of a single crop
+The @option{--polygon} option is a high-level method to define any convex
polygon (with any number of vertices).
+Please see the description of this option in @ref{Invoking astcrop} for its
syntax.
+@end table
-It is commonly necessary to do operations on some or all of the elements of a
dataset independently (pixels in an image).
-For example, in the reduction of raw data it is necessary to subtract the Sky value (@ref{Sky value}) from each image.
-Later (once the images are warped into a single grid using Warp for example, see @ref{Warp}), the images are co-added (the output pixel grid is the average of the pixels of the individual input images).
-Arithmetic is Gnuastro's program for such operations on your datasets directly
from the command-line.
-It currently uses the reverse polish or post-fix notation, see @ref{Reverse
polish notation} and will work on the native data types of the input
images/data to reduce CPU and RAM resources, see @ref{Numeric data types}.
-For more information on how to run Arithmetic, please see @ref{Invoking
astarithmetic}.
+@cartouche
+@noindent
+@strong{CAUTION:} In WCS mode, the image has to be aligned with the celestial
coordinates, such that the first FITS axis is parallel (opposite direction) to
the Right Ascension (RA) and the second FITS axis is parallel to the
declination.
+If these conditions are not met for an image, Crop will warn you and abort.
+You can use Warp to align the input image to standard celestial coordinates,
see @ref{Warp}.
+@end cartouche
+@end table
-@menu
-* Reverse polish notation:: The current notation style for Arithmetic
-* Integer benefits and pitfalls:: Integers have major benefits, but require
care
-* Arithmetic operators:: List of operators known to Arithmetic
-* Invoking astarithmetic:: How to run Arithmetic: options and output
-@end menu
+As a summary, if you do not specify a catalog, you have to define the cropped
region manually on the command-line.
+In any case, the mode is mandatory for Crop to be able to interpret the values given as coordinates or widths.
-@node Reverse polish notation, Integer benefits and pitfalls, Arithmetic,
Arithmetic
-@subsection Reverse polish notation
-@cindex Post-fix notation
-@cindex Reverse Polish Notation
-The most common notation for arithmetic operations is the
@url{https://en.wikipedia.org/wiki/Infix_notation, infix notation} where the
operator goes between the two operands, for example, @mymath{4+5}.
-The infix notation is the preferred way in most programming languages which
come with scripting features for large programs.
-This is because the infix notation requires a way to define precedence when
more than one operator is involved.
+@node Crop section syntax, Blank pixels, Crop modes, Crop
+@subsection Crop section syntax
-For example, consider the statement @code{5 + 6 / 2}.
-Should 6 first be divided by 2, then added by 5?
-Or should 5 first be added with 6, then divided by 2?
-Therefore we need parenthesis to show precedence: @code{5+(6/2)} or
@code{(5+6)/2}.
-Furthermore, if you need to leave a value for later processing, you will need
to define a variable for it; for example, @code{a=(5+6)/2}.
+@cindex Crop a given section of image
+When in image mode, one of the methods to crop only one rectangular section
from the input image is to use the @option{--section} option.
+Crop has a powerful syntax to read the box parameters from a string of
characters.
+If you leave certain parts of the string empty, Crop can fill them for you based on the input image sizes.
-Gnuastro provides libraries where you can also use infix notation in C or C++
programs.
-However, Gnuastro's programs are primarily designed to be run on the
command-line and the level of complexity that infix notation requires can be
annoying/confusing to write on the command-line (where they can get confused
with the shell's parenthesis or variable definitions).
-Therefore Gnuastro's Arithmetic and Table (when doing column arithmetic)
programs use the post-fix notation, also known as
@url{https://en.wikipedia.org/wiki/Reverse_Polish_notation, reverse polish
notation}.
-For example, instead of writing @command{5+6}, we write @command{5 6 +}.
+@cindex Define section to crop
+To define a box, you need the positions of two points: the first (@code{X1}, @code{Y1}) and the last (@code{X2}, @code{Y2}) pixel of the box in the image, or four integer numbers in total.
+The four coordinates can be specified with one string in this format:
`@command{X1:X2,Y1:Y2}'.
+This string is given to the @option{--section} option.
+Therefore, the pixels along the first axis that are @mymath{\geq}@command{X1}
and @mymath{\leq}@command{X2} will be included in the cropped image.
+The same goes for the second axis.
+Note that each term will be read as an integer, not a float.
-The Wikipedia article on the reverse polish notation provides some excellent explanation of this notation, but we will give a short summary here for self-sufficiency.
-In short, in the reverse polish notation, the operator is placed after the
operands.
-As we will see below this removes the need to define parenthesis and lets you
use previous values without needing to define a variable.
-In the future@footnote{@url{https://savannah.gnu.org/task/index.php?13867}} we
do plan to also optionally allow infix notation when arithmetic operations on
datasets are desired, but due to time constraints on the developers we cannot
do it immediately.
+The reason it only accepts integers is that @option{--section} is a low-level
option (which is also very fast!).
+For a higher-level way to specify a region (any polygon, not just a box), please see the @option{--polygon} option in @ref{Crop options}.
+Also note that in the FITS standard, pixel indexes along each axis start from unity (1), not zero (0).
-To easily understand how the reverse polish notation works, you can think of
each operand (@code{5} and @code{6} in the example above) as a node in a
``last-in-first-out'' stack.
-One such stack in daily life is a stack of dishes in the kitchen: you put a
clean dish, on the top of a stack of dishes when it is ready for later usage.
-Later, when you need a dish, you pick the top one (hence the ``last'' dish
placed ``in'' the stack is the ``first'' dish that comes ``out'' when
necessary).
+@cindex Crop section format
+You can omit any of the values and they will be filled automatically.
+The left hand side of the colon (@command{:}) will be filled with @command{1},
and the right side with the image size.
+So, @command{2:,:} will include the full range of pixels along the second axis, and only those with a first axis index of @command{2} or larger.
+If the colon is omitted for a dimension, then the full range is automatically
used.
+So the same string is also equal to @command{2:,} or @command{2:} or even
@command{2}.
+If you want such a case for the second axis, you should set it to:
@command{,2}.
-Each operator will need a certain number of operands (in the example above,
the @code{+} operator needs two operands: @code{5} and @code{6}).
-In the kitchen metaphor, an operator can be an oven.
-Every time an operator is confronted, the operator takes (or ``pops'') the
number of operands it needs from the top of the stack (so they do not exist in
the stack any more), does its operation, and places (or ``pushes'') the result
back on top of the stack.
-So if you want the average of 5 and 6, you would write: @command{5 6 + 2 /}.
-The operations that are done are:
+If you specify a negative value, it will be interpreted as an index before the start of the image (outside the image, along the bottom or left sides when viewed in SAO DS9).
+In case you want to count from the top or right sides of the image, you can
use an asterisk (@option{*}).
+When confronted with a @option{*}, Crop will replace it with the maximum
length of the image in that dimension.
+So @command{*-10:*+10,*-20:*+20} will mean that the crop box will be
@math{20\times40} pixels in size and only include the top corner of the input
image with 3/4 of the image being covered by blank pixels, see @ref{Blank
pixels}.
-@enumerate
-@item
-@command{5} is an operand, so Arithmetic pushes it to the top of the stack
(which is initially empty).
-In the kitchen metaphor, you can visualize this as taking a new dish from the
cabinet, putting the number 5 inside of the dish, and putting the dish on top
of the (empty) cooking table in front of you.
-You now have a stack of one dish on the table in front of you.
-@item
-@command{6} is also an operand, so it is pushed to the top of the stack.
-Like before, you can visualize this as taking a new dish from the cabinet,
putting the number 6 in it and placing it on top of the previous dish.
-You now have a stack of two dishes on the table in front of you.
-@item
-@command{+} is a @emph{binary} operator, so it will pop the top two elements
of the stack out of it, and perform addition on them (the order is @mymath{5+6}
in the example above).
-The result is @command{11} which is pushed to the top of the stack.
+If you feel more comfortable with space characters between the values, you can use as many space characters as you wish; just be careful to put your value in double quotes, for example, @command{--section="5:200, 123:854"}.
+If you forget the quotes, anything after the first space will not be seen by
@option{--section} and you will most probably get an error because the rest of
your string will be read as a filename (which most probably does not exist).
+See @ref{Command-line} for a description of how the command-line works.
-To visualize this, you can think of the @code{+} operator as an oven with a
place for two dishes.
-You pick up the top-most dish (that has the number 6 in it) and put it in the
oven.
-The top dish is now the one that has the number 5.
-You also pick it up and put it in the oven, and close the oven door.
-When the oven has finished its cooking, it produces a single output (in one
dish, with the number 11 inside of it).
-You take that output dish and put it back on the table.
-You now have a stack of one dish on the table in front of you.
-@item
-@command{2} is an operand so push it onto the top of the stack.
-In the kitchen metaphor, you again go to the cabinet, pick up a dish and put
the number 2 inside of it and put the dish over the previous dish (that has the
number 11).
-You now have a stack of two dishes on the table in front of you.
-@item
-@command{/} (division) is a binary operator, so pull out the top two elements
of the stack (top-most is @command{2}, then @command{11}) and divide the second
one by the first.
-In the kitchen metaphor, the @command{/} operator can be visualized as a
microwave that takes two dishes.
-But unlike the oven (@code{+} operator) before, the order of inputs matters (they are on top of each other: with the top dish holder being the numerator and the bottom one being the denominator).
-Again, you look to your stack of dishes on the table.
-You pick up the top one (with value 2 inside of it) and put it in the
microwave's bottom (denominator) dish holder.
-Then you go back to your stack of dishes on the table and pick up the top dish (with value 11 inside of it) and put that in the top (numerator) dish holder.
-The microwave will do its work and when it is finished, returns a new dish
with the single value 5.5 inside of it.
-You pick up the dish from the microwave and place it back on the table.
+@node Blank pixels, Invoking astcrop, Crop section syntax, Crop
+@subsection Blank pixels
+@cindex Blank pixel
+The cropped box can potentially include pixels that are beyond the image range; for example, when a target in the input catalog is very near the edge of the input image.
+The parts of the cropped image that were not in the input image will be filled
with the following two values depending on the data type of the image.
+In both cases, SAO DS9 will not color code those pixels.
+@itemize
@item
-There are no more operands or operators, so simply return the remaining
operand in the output.
-In the kitchen metaphor, you see that your recipe has no more steps, so you
just pick up the remaining dish and take it to the dining room to enjoy a good
dinner.
-@end enumerate
-
-In the Arithmetic program, the operands can be FITS images of any
dimensionality, or numbers (see @ref{Invoking astarithmetic}).
-In Table's column arithmetic, they can be any column in the table (a series of
numbers in an array) or a single number (see @ref{Column arithmetic}).
-
-With this notation, very complicated procedures can be created without the
need for parenthesis or worrying about precedence.
-Even functions which take an arbitrary number of arguments can be defined in
this notation.
-This is a very powerful notation and is used in languages like Postscript
@footnote{See the EPS and PDF part of @ref{Recognized file formats} for a
little more on the Postscript language.} which produces PDF files when compiled.
-
+If the data type of the image is a floating point type (float or double), IEEE
NaN (Not a number) will be used.
+@item
+For integer types, pixels out of the image will be filled with the value of the @command{BLANK} keyword in the cropped image header.
+The value assigned to it is the lowest value possible for that type, so you will probably never need it anyway.
+Only for the unsigned character type (@command{BITPIX=8} in the FITS header) is the maximum value used, because it is unsigned and its smallest value (zero) is often meaningful.
+@end itemize
+You can ask for such blank regions to not be included in the output crop image
using the @option{--noblank} option.
+In such cases, there is no guarantee that the image size of your outputs is what you asked for.
+Unfortunately, some survey images do not use the @command{BLANK} FITS keyword.
+So by default, when dealing with float or double image types, any values that
are 0.0 are also regarded as blank regions.
+This can be turned off with the @option{--zeroisnotblank} option.
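+
+For example, with a hypothetical @file{image.fits}, the first command below
+discards the blank-filled parts of a crop box that extends beyond the image
+edge, and the second command treats zero-valued pixels as usable data:
+
+@example
+$ astcrop image.fits --mode=img --center=50,50 --width=201 \
+          --noblank
+$ astcrop image.fits --mode=img --center=50,50 --width=201 \
+          --zeroisnotblank
+@end example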
-@node Integer benefits and pitfalls, Arithmetic operators, Reverse polish
notation, Arithmetic
-@subsection Integer benefits and pitfalls
+@node Invoking astcrop, , Blank pixels, Crop
+@subsection Invoking Crop
-Integers are the simplest numerical data types (@ref{Numeric data types}).
-Because of this, their storage space is much less, and their processing is
much faster than floating point types.
-You can confirm this on your computer with the series of commands below.
-You will make four 5000 by 5000 pixel images filled with random values.
-Two of them will be saved as signed 8-bit integers, and two with 64-bit
floating point types.
-The last command prints the size of the created images.
+Crop will crop a region from an image.
+If in WCS mode, it will also stitch parts from separate images in the input
files.
+The executable name is @file{astcrop} with the following general template
@example
-$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8 -oint-1.fits
-$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8 -oint-2.fits
-$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-1.fits
-$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-2.fits
-$ ls -lh int-*.fits flt-*.fits
+$ astcrop [OPTION...] [ASCIIcatalog] ASTRdata ...
@end example
-The 8-bit integer images are only 24MB, while the 64-bit floating point images
are 191 MB!
-Besides helping in storage (on your disk, or in RAM, while the program is
running), the small size of these files also helps in faster reading of the
inputs.
-Furthermore, CPUs can process integer operations much faster than floating
points.
-Among the integers, the ones with a smaller width (number of bits) can be processed much faster.
-You can see this with the two commands below, where you will add the integer images with each other and the floats with each other:
+
+@noindent
+One line examples:
@example
-$ astarithmetic flt-1.fits flt-2.fits + -oflt-sum.fits -g1
-$ astarithmetic int-1.fits int-2.fits + -oint-sum.fits -g1
-@end example
+## Crop all objects in cat.txt from image.fits:
+$ astcrop --catalog=cat.txt image.fits
-Have a look at the running time of the two commands above (that is printed on
their last line).
-On the system that this paragraph was written on, the floating point and
integer image sums were respectively done in 0.481 and 0.089 seconds (the
integer operation was almost 5 times faster!).
+## Crop all objects in catalog (with RA,DEC) from all the files
+## ending in `_drz.fits' in `/mnt/data/COSMOS/':
+$ astcrop --mode=wcs --catalog=cat.txt /mnt/data/COSMOS/*_drz.fits
-@cartouche
-@noindent
-@strong{If your data does not have decimal points, use integer types:} integer
types are much faster and can take much less space in your storage or RAM
(while the program is running).
-@end cartouche
+## Crop the outer 10 border pixels of the input image and give
+## the output HDU a name ('EXTNAME' keyword in FITS) of 'mysection'.
+$ astcrop --section=10:*-10,10:*-10 --hdu=2 image.fits \
+ --metaname=mysection
-@cartouche
-@noindent
-@strong{Select the smallest width that can host the range/precision of
values:} for example, if the largest possible value in your dataset is 1000 and
all numbers are integers, store it as a 16-bit integer.
-Also, if you know the values can never become negative, store it as an
unsigned 16-bit integer.
-For floating point types, if you know you will not need a precision of more
than 6 significant digits, use the 32-bit floating point type.
-For more on the range (for integers) and precision (for floats), see
@ref{Numeric data types}.
-@end cartouche
+## Crop region around RA and Dec of (189.16704, 62.218203):
+$ astcrop --mode=wcs --center=189.16704,62.218203 goodsnorth.fits
-There is a price to be paid for this improved efficiency in integers: your
wisdom!
-If you have not selected your types wisely, strange situations may happen.
-For example, try the command below:
+## Same crop above, but coordinates given in sexagesimal (you can
+## also use ':' between the sexagesimal components).
+$ astcrop --mode=wcs --center=12h36m40.08,62d13m5.53 goodsnorth.fits
-@example
-$ astarithmetic 125 10 +
+## Crop region around pixel coordinate (568.342, 2091.719):
+$ astcrop --mode=img --center=568.342,2091.719 --width=201 image.fits
+
+## Crop all HDUs within a FITS file at a certain coordinate, while
+## preserving the names of the HDUs in the output.
+$ for hdu in $(astfits input.fits --listimagehdus); do \
+ astcrop input.fits --hdu=$hdu --append --output=crop.fits \
+ --metaname=$hdu --mode=wcs --center=189.16704,62.218203 \
+ --width=10/3600
+ done
@end example
-@cindex Integer overflow
-@cindex Overflow, integer
@noindent
-You expect the output to be @mymath{135}, but it will be @mymath{-121}!
-The reason is that when Arithmetic (or column-arithmetic in Table) confronts a number on the command-line, it uses the principles above to select the most efficient type for each number.
-Both @mymath{125} and @mymath{10} can safely fit within a signed, 8-bit
integer type, so arithmetic will store both as an 8-bit integer.
-However, the sum (@mymath{135}) is larger than the maximum possible value of
an 8-bit signed integer (@mymath{127}).
-Therefore an integer overflow will occur, and the bits will be over-written.
-As a result, the value will be @mymath{135-128=7} more than the minimum value
of this type (@mymath{-128}), which is @mymath{-128+7=-121}.
-
-When you know situations like this may occur, you can simply use
@ref{Numerical type conversion operators}, to set just one of the inputs to a
wider data type (the smallest, wider type to avoid wasting resources).
-In the example above, this would be @code{uint16}:
-
-@example
-$ astarithmetic 125 uint16 10 +
-@end example
+Crop has one mandatory argument which is the input image name(s), shown above
with @file{ASTRdata ...}.
+You can use shell expansions, for example, @command{*} for this if you have
lots of images in WCS mode.
+If the crop box centers are in a catalog, you can use the @option{--catalog}
option.
+In other cases, the parameters of the single cropped output must be given with command-line options.
+See @ref{Crop output} for how the output file name(s) can be specified.
+For the full list of general options to all Gnuastro programs (including
Crop), please see @ref{Common options}.
-The reason this worked is that @mymath{125} is now converted into an unsigned
16-bit integer before the @code{+} operator.
-Since this is larger than an 8-bit integer, the C programming language's
automatic type conversion will treat both as the wider type and store the
result of the binary operation (@code{+}) in that type.
+Floating point numbers can be used to specify the crop region (except the
@option{--section} option, see @ref{Crop section syntax}).
+In such cases, the floating point values will be used to find the desired
integer pixel indices based on the FITS standard.
+Hence, Crop ultimately does not do any sub-pixel cropping (in other words, it
does not change pixel values).
+If you need such crops, you can use @ref{Warp} to first warp the image to a new pixel grid, then crop from that.
+For example, let's assume you want a crop from pixels 12.982 to 80.982 along
the first dimension.
+You should first translate the image by @mymath{-0.482} (note that the edge of
a pixel is at integer multiples of @mymath{0.5}).
+So you should run Warp with @option{--translate=-0.482,0} and then crop the
warped image with @option{--section=13:81}.
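+
+For instance, a minimal sketch of these two steps (assuming a hypothetical
+input called @file{image.fits}) would be:
+
+@example
+$ astwarp image.fits --translate=-0.482,0 --output=translated.fits
+$ astcrop translated.fits --section=13:81 --output=cropped.fits
+@end example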
-For such a basic operation like the command above, a faster hack would be any
of the two commands below (which are equivalent).
-This is because @code{125.0} or @code{125.} are interpreted as floating-point
types and they do not suffer from such issues (converting only on one input is
enough):
+There are two ways to define the cropped region: with its center or its
vertices.
+See @ref{Crop modes} for a full description.
+In the former case, Crop can check if the central region of the cropped image
is indeed filled with data or is blank (see @ref{Blank pixels}), and not
produce any output when the center is blank, see the description under
@option{--checkcenter} for more.
-@example
-$ astarithmetic 125. 10 +
-$ astarithmetic 125.0 10 +
-@end example
+@cindex Asynchronous thread allocation
+When in catalog mode, Crop will run in parallel unless you set
@option{--numthreads=1}, see @ref{Multi-threaded operations}.
+Note that when multiple outputs are created with threads, the outputs will not
be created in the same order.
+This is because the threads are asynchronous and thus not started in order.
+This has no effect on each output, see @ref{Reddest clumps cutouts and
parallelization} for a tutorial on effectively using this feature.
-For this particular command, the fix above will be as fast as the
@code{uint16} solution.
-This is because there are only two numbers, and the overhead of Arithmetic
(reading configuration files, etc.) dominates the running time.
-However, for large datasets, the @code{uint16} solution will be faster (as you
saw above), Arithmetic will consume less RAM while running, and the output will
consume less storage in your system (all major benefits)!
+@menu
+* Crop options:: A list of all the options with explanation.
+* Crop output:: The outputs of Crop.
+* Crop known issues:: Known issues in running Crop.
+@end menu
-It is possible to do internal checks in Gnuastro and catch integer overflows
and correct them internally.
-However, we have not opted for this solution because all those checks will
consume significant resources and slow down the program (especially with large
datasets where RAM, storage and running time become important).
-To be optimal, we therefore trust that you (the wise Gnuastro user!) make the
appropriate type conversion in your commands where necessary (recall that the
operators are available in @ref{Numerical type conversion operators}).
+@node Crop options, Crop output, Invoking astcrop, Invoking astcrop
+@subsubsection Crop options
-@node Arithmetic operators, Invoking astarithmetic, Integer benefits and
pitfalls, Arithmetic
-@subsection Arithmetic operators
+The options can be classified into the following contexts: Input, Output and
operating mode options.
+Options that are common to all Gnuastro programs are listed in @ref{Common options} and will not be repeated here.
-In this section, the operators recognized by Arithmetic (and the Table program's @ref{Column arithmetic}) are listed and discussed in detail with examples.
-As mentioned before, to be able to easily do complex operations on the
command-line, the Reverse Polish Notation is used (where you write
`@mymath{4\quad5\quad+}' instead of `@mymath{4 + 5}'), if you are not already
familiar with it, before continuing, please see @ref{Reverse polish notation}.
+When you are specifying the crop vertices yourself (through
@option{--section}, or @option{--polygon}) on relatively small regions
(depending on the resolution of your images) the outputs from image and WCS
mode can be approximately equivalent.
+However, as the crop sizes get large, the curved nature of the WCS coordinates
have to be considered.
+For example, when using @option{--section}, the right ascension of the bottom
left and top left corners will not be equal.
+If you only want regions within a given right ascension, use
@option{--polygon} in WCS mode.
-The operands to all operators can be a data array (for example, a FITS image
or data cube) or a number, the output will be an array or number according to
the inputs.
-For example, a number multiplied by an array will produce an array.
-The numerical data type of the output of each operator is described within it.
-Here are some generic tips and tricks (relevant to all operators):
+@noindent
+Input image parameters:
+@table @option
-@table @asis
-@item Multiple operators in one command
-When you need to use arithmetic commands in several consecutive operations,
you can use one command instead of multiple commands and perform all
calculations in the same command.
-For example, assume you want to apply a threshold of 10 on your image, and
label the connected groups of pixel above this threshold.
-You need two operators for this: @code{gt} (for ``greater than'', see
@ref{Conditional operators}) and @code{connected-components} (see
@ref{Mathematical morphology operators}).
-The bad (non-optimized and slow) way of doing this is to call Arithmetic two
times:
-@example
-$ astarithmetic image.fits 10 gt --output=thresh.fits
-$ astarithmetic thresh.fits 2 connected-components \
- --output=labeled.fits
-$ rm thresh.fits
-@end example
+@item --hstartwcs=INT
+Specify the first keyword card (line number) to start finding the input image
world coordinate system information.
+This is useful when certain header keywords of the input may cause bad
conflicts with your crop (see an example described below).
+To get line numbers of the header keywords, you can pipe the fully printed
header into @command{cat -n} like below:
-The good (optimal) way is to call them after each other (remember @ref{Reverse
polish notation}):
@example
-$ astarithmetic image.fits 10 gt 2 connected-components \
- --output=labeled.fits
+$ astfits image.fits -h1 | cat -n
@end example
-You can similarly add any number of operations that must be done sequentially
in a single command and benefit from the speed and lack of intermediate files.
-When your commands become long, you can use the @code{set-AAA} operator to
make it more readable, see @ref{Operand storage in memory or a file}.
-
+@cindex CANDELS survey
+For example, distortions have only been present in WCSLIB from version 5.15 (released in mid 2016).
+Therefore some pipelines still apply their own specific set of WCS keywords for distortions and put them into the image header along with those that WCSLIB does recognize.
+Now that WCSLIB recognizes most of the standard distortion parameters, the two sets of keywords get confused with each other and give wrong results.
+One example is the CANDELS-GOODS South images that were created before WCSLIB 5.15@footnote{@url{https://archive.stsci.edu/pub/hlsp/candels/goods-s/gs-tot/v1.0/}}.
-@item Blank pixels in Arithmetic
-Blank pixels in the image (see @ref{Blank pixels}) will be stored based on the
data type.
-When the input is floating point type, blank values are NaN.
-One aspect of NaN values is that by definition they will fail on @emph{any}
comparison.
-Also, any operator that includes a NaN as an operand will produce a NaN (irrespective of its other operands).
-Hence both equal and not-equal operators will fail when both their operands
are NaN!
-Therefore, the only way to guarantee selection of blank pixels is through the
@command{isblank} operator explained above.
+The two @option{--hstartwcs} and @option{--hendwcs} are thus provided so when
using older datasets, you can specify what region in the FITS headers you want
to use to read the WCS keywords.
+Note that this is only relevant for reading the WCS information, basic data
information like the image size are read separately.
+These two options will only be considered when the value to @option{--hendwcs}
is larger than that of @option{--hstartwcs}.
+So if they are equal or @option{--hstartwcs} is larger than
@option{--hendwcs}, then all the input keywords will be parsed to get the WCS
information of the image.
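+
+For example, with the hypothetical card numbers 20 and 80, the command below
+will only use header cards 20 to 80 to find the WCS of @file{image.fits}:
+
+@example
+$ astcrop image.fits --hstartwcs=20 --hendwcs=80 --mode=wcs \
+          --center=189.16704,62.218203 --width=10/3600
+@end example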
-One way you can exploit this property of the NaN value to your advantage is
when you want a fully zero-valued image (even over the blank pixels) based on
an already existing image (with same size and world coordinate system settings).
-The following command will produce this for you:
+@item --hendwcs=INT
+Specify the last keyword card to read for specifying the image world
coordinate system on the input images.
+See the description of @option{--hstartwcs} above for more.
-@example
-$ astarithmetic input.fits nan eq --output=all-zeros.fits
-@end example
+@end table
@noindent
-Note that on the command-line you can write NaN in any case (for example,
@command{NaN}, or @command{NAN} are also acceptable).
-Reading NaN as a floating point number in Gnuastro is not case-sensitive.
-@end table
+Crop box parameters:
+@table @option
-@menu
-* Basic mathematical operators:: For example, +, -, /, log, and pow.
-* Trigonometric and hyperbolic operators:: sin, cos, atan, asinh, etc.
-* Constants:: Physical and Mathematical constants.
-* Unit conversion operators:: Various unit conversions necessary.
-* Statistical operators:: Statistics of a single dataset (for example,
mean).
-* Stacking operators:: Coadding or combining multiple datasets into
one.
-* Filtering operators:: Smoothing a dataset through mixing pixel with
neighbors.
-* Pooling operators:: Reducing size through statistics of pixels in
window.
-* Interpolation operators:: Giving blank pixels a value.
-* Dimensionality changing operators:: Collapse or expand a dataset.
-* Conditional operators:: Select certain pixels within the dataset.
-* Mathematical morphology operators:: Work on binary images, for example,
erode.
-* Bitwise operators:: Work on bits within one pixel.
-* Numerical type conversion operators:: Convert the numeric datatype of a
dataset.
-* Random number generators:: Random numbers can be used to add noise for
example.
-* Box shape operators:: Dealing with box shapes and coordinates of
vertices.
-* Loading external columns:: Read a column from a table into the stack.
-* Size and position operators:: Extracting image size and pixel positions.
-* Building new dataset and stack management:: How to construct an empty
dataset from scratch.
-* Operand storage in memory or a file:: Tools for complex operations in one
command.
-@end menu
+@item -c FLT[,FLT[,...]]
+@itemx --center=FLT[,FLT[,...]]
+The central position of the crop in the input image.
+The positions along each dimension must be separated by a comma (@key{,}) and
fractions are also acceptable.
+The comma-separated values can either be in degrees (a single number), or
sexagesimal (@code{_h_m_} for RA, @code{_d_m_} for Dec, or @code{_:_:_} for
both).
-@node Basic mathematical operators, Trigonometric and hyperbolic operators,
Arithmetic operators, Arithmetic operators
-@subsubsection Basic mathematical operators
+The number of values given to this option must be the same as the dimensions
of the input dataset.
+The width of the crop should be set with @code{--width}.
+The units of the coordinates are read based on the value to the
@option{--mode} option, see below.
-These are some of the most common operations you will be doing on your data, so no further explanation is necessary.
-If you are new to Gnuastro, just read the description of each carefully.
+@item -O STR
+@itemx --mode=STR
+Mode to interpret the crop's coordinates (for example with @option{--center},
@option{--catalog} or @option{--polygon}).
+The value must either be @option{img} (to assume image/pixel coordinates) or
@option{wcs} (to assume WCS, usually RA/Dec, coordinates), see @ref{Crop modes}
for a full description.
-@table @command
+@item -w FLT[,FLT[,...]]
+@itemx --width=FLT[,FLT[,...]]
+Width of the cropped region about the coordinate given to @option{--center}.
+If in WCS mode, value(s) given to this option will be read in the same units
as the dataset's WCS information along this dimension (unless
@option{--widthinpix} is given).
+This option may take either a single value (to be used for all dimensions:
@option{--width=10} in image-mode will crop a @mymath{10\times10} pixel image)
or multiple values (a specific value for each dimension: @option{--width=10,20}
in image-mode will crop a @mymath{10\times20} pixel image).
-@item +
-Addition, so ``@command{4 5 +}'' is equivalent to @mymath{4+5}.
-For example, in the command below, the value 20000 is added to each pixel's
value in @file{image.fits}:
-@example
-$ astarithmetic 20000 image.fits +
-@end example
-You can also use this operator to sum the values of one pixel in two images
(which have to be the same size).
-For example, in the commands below (which are identical, see paragraph after
the commands), each pixel of @file{sum.fits} is the sum of the same pixel's
values in @file{a.fits} and @file{b.fits}.
-@example
-$ astarithmetic a.fits b.fits + -h1 -h1 --output=sum.fits
-$ astarithmetic a.fits b.fits + -g1 --output=sum.fits
-@end example
-The HDU/extension has to be specified for each image with @option{-h}.
-However, if the HDUs are the same in all inputs, you can use @option{-g} to only specify the HDU once.
+The @code{--width} option also accepts fractions.
+For example, if you want the width of your crop to be 3 by 5 arcseconds along
RA and Dec respectively and you are in wcs-mode, you can use:
@option{--width=3/3600,5/3600}.
-If you need to add more than one dataset, one way is to use this operator
multiple times, for example, see the two commands below that are identical in
the Reverse Polish Notation (@ref{Reverse polish notation}):
-@example
-$ astarithmetic a.fits b.fits + c.fits + -osum.fits
-$ astarithmetic a.fits b.fits c.fits + + -osum.fits
-@end example
+The final output will have an odd number of pixels to allow easy
identification of the pixel which keeps your requested coordinate (from
@option{--center} or @option{--catalog}).
+If you want an even sided crop, you can run Crop afterwards with
@option{--section=":*-1,:*-1"} or @option{--section=2:,2:} (depending on which
side you do not need), see @ref{Crop section syntax}.
-However, this can get annoying/buggy if you have more than three or four
images, in that case, a better way to sum data is to use the @code{sum}
operator (which also ignores blank pixels), that is discussed in @ref{Stacking
operators}.
+The basic reason for making an odd-sided crop is that your given central
coordinate will ultimately fall within a discrete pixel in the image (defined
by the FITS standard).
+When the crop has an odd number of pixels in each dimension, that pixel can be
very well defined as the ``central'' pixel of the crop, making it unambiguously
easy to identify.
+However, for an even-sided crop, it will be very hard to identify the central
pixel (it can be on any of the four pixels adjacent to the central point of the
image!).
-@cartouche
-@noindent
-@strong{NaN values:} if a single argument of @code{+} has a NaN value, the
output will also be NaN.
-To ignore NaN values, use the @code{sum} operator of @ref{Stacking operators}.
-You can see the difference with the two commands below:
+@item -X
+@itemx --widthinpix
+In WCS mode, interpret the value to @option{--width} as number of pixels, not
the WCS units like degrees.
+This is useful when you want a fixed crop size in pixels, even though your
center coordinates are in WCS (for example, RA and Dec).
-@example
-$ astarithmetic --quiet 1.0 2.0 3.0 nan + + +
-nan
-$ astarithmetic --quiet 1.0 2.0 3.0 nan 4 sum
-6.000000e+00
-@end example
+@item -l STR
+@itemx -l FLT:FLT,...
+@itemx --polygon=STR
+@itemx --polygon=FLT,FLT:FLT,FLT:...
+@cindex Sexagesimal
+Polygon vertex coordinates (when the value is in @option{FLT,FLT:FLT,FLT:...} format) or the file name of a SAO DS9 region file (when the value has no @file{,} or @file{:} characters).
+Each vertex can either be in degrees (a single floating point number) or sexagesimal (in formats of `@code{_h_m_}' for RA and `@code{_d_m_}' for Dec, or simply `@code{_:_:_}' for either of them).
-The same goes for all the @ref{Stacking operators} so if your data may include
NaN pixels, be sure to use the stacking operators.
-@end cartouche
+The vertices are used to define the polygon in the same order given to this option.
+When the vertices are not in the proper order (for example, one vertex of a square comes after its diagonal opposite), you can add the @option{--polygonsort} option, which will attempt to sort the vertices before cropping.
+Note that for concave polygons, sorting is not recommended because there is no unique solution; for more, see the description under @option{--polygonsort}.
-@item -
-Subtraction, so ``@command{4 5 -}'' is equivalent to @mymath{4-5}.
-Usage of this operator is similar to @command{+} operator, for example:
-@example
-$ astarithmetic 20000 image.fits -
-$ astarithmetic a.fits b.fits - -g1 --output=sub.fits
-@end example
+This option can be used both in the image and WCS modes, see @ref{Crop modes}.
+If a SAO DS9 region file is used, the coordinate mode of Crop will be
determined by the contents of the file and any value given to @code{--mode} is
ignored.
+The cropped image will be the size of the rectangular region that completely
encompasses the polygon.
+By default all the pixels that are outside of the polygon will be set as blank
values (see @ref{Blank pixels}).
+However, if @option{--polygonout} is called all pixels internal to the
vertices will be set to blank.
+In WCS-mode, you may provide many FITS images/tiles: Crop will stitch them to
produce this cropped region, then apply the polygon.
-@item x
-Multiplication, so ``@command{4 5 x}'' is equivalent to @mymath{4\times5}.
-For example, in the command below, the value of each output pixel is 5 times
its value in @file{image.fits}:
-@example
-$ astarithmetic image.fits 5 x
-@end example
-And you can multiply the value of each pixel in two images, like this:
-@example
-$ astarithmetic a.fits a.fits x -g1 --output=multip.fits
-@end example
+The syntax for the polygon vertices is similar to, and simpler than, that for
@option{--section}.
+In short, the dimensions of each coordinate are separated by a comma (@key{,})
and each vertex is separated by a colon (@key{:}).
+You can define as many vertices as you like.
+If you would like to use space characters between the dimensions and vertices
to make them more human-readable, then you have to put the value to this option
in double quotation marks.
-@item /
-Division, so ``@command{4 5 /}'' is equivalent to @mymath{4/5}.
-Like the multiplication, for example
-@example
-$ astarithmetic image.fits 5 -h1 /
-$ astarithmetic a.fits b.fits / -g1 --output=div.fits
-@end example
+For example, let's assume you want to work on the deepest part of the WFC3/IR
images of Hubble Space Telescope eXtreme Deep Field (HST-XDF).
+@url{https://archive.stsci.edu/prepds/xdf/, According to the web
page}@footnote{@url{https://archive.stsci.edu/prepds/xdf/}} the deepest part is
contained within the coordinates:
-@item %
-Modulo (remainder), so ``@command{3 2 %}'' will return @mymath{1}.
-Note that the modulo operator only works on integer types (see @ref{Numeric
data types}).
-This operator is therefore not defined for most processed astronomical images that have floating-point values.
-However, it is useful in labeled images (for example, @ref{Segment output}).
-In such cases, each pixel is the integer label of the object it is associated with, hence with the example command below, we can change the labels to only be between 1 and 4 and decrease all objects on the image to 4/5th (all objects with a label that is a multiple of 5 will be set to 0).
@example
-$ astarithmetic label.fits 5 1 %
+[ (53.187414,-27.779152), (53.159507,-27.759633),
+ (53.134517,-27.787144), (53.161906,-27.807208) ]
@end example
-@item abs
-Absolute value of first operand, so ``@command{4 abs}'' is equivalent to
@mymath{|4|}.
-For example, the output of the command below will not have any negative pixels (all negative pixels will be multiplied by @mymath{-1} to become positive):
+They have provided mask images with only these pixels in the WFC3/IR images,
but what if you also need to work on the same region in the full resolution ACS
images? Also what if you want to use the CANDELS data for the shallow region?
Running Crop with @option{--polygon} will easily pull out this region of the
image for you, irrespective of the resolution.
+If you have set the operating mode to WCS mode in your nearest configuration
file (see @ref{Configuration files}), there is no need to call
@option{--mode=wcs} on the command-line.
+
@example
-$ astarithmetic image.fits abs
+$ astcrop --mode=wcs desired-filter-image(s).fits \
+ --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
+ 53.134517,-27.787144 : 53.161906,-27.807208"
@end example
+@cindex SAO DS9 region file
+@cindex Region file (SAO DS9)
+More generally, suppose you have an image and want to define the polygon yourself (it is not already published like the example above).
+As the number of vertices increases, checking the vertex coordinates on a FITS viewer (for example, SAO DS9) and typing them in, one by one, can be very tedious and prone to typos.
+In such cases, you can make a polygon ``region'' in DS9 and easily define (and visually inspect) it with your mouse.
+Given that SAO DS9 has a graphical user interface (GUI), if you do not have the polygon vertices beforehand, it is much easier to build your polygon there and pass it onto Crop through the region file.
-@item pow
-First operand to the power of the second, so ``@command{4.3 5 pow}'' is
equivalent to @mymath{4.3^{5}}.
-For example, with the command below all pixels will be squared
-@example
-$ astarithmetic image.fits 2 pow
-@end example
+You can take the following steps to make an SAO DS9 region file containing
your polygon.
+Open your desired FITS image with SAO DS9 and activate its ``region'' mode
with @clicksequence{Edit@click{}Region}.
+Then define the region as a polygon with
@clicksequence{Region@click{}Shape@click{}Polygon}.
+Click on the approximate center of the region you want and a small square will
appear.
+By clicking on the vertices of the square you can shrink or expand it; clicking and dragging anywhere on the edges will enable you to define a new vertex.
+After the region has been nicely defined, save it as a file with
@clicksequence{Region@click{}``Save Regions''}.
+You can then select the name and address of the output file, keep the format
as @command{REG (*.reg)} and press the ``OK'' button.
+In the next window, keep format as ``ds9'' and ``Coordinate System'' as
``fk5'' for RA and Dec (or ``Image'' for pixel coordinates).
+A plain text file is now created (let's call it @file{ds9.reg}) which you can
pass onto Crop with @command{--polygon=ds9.reg}.
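+
+For example, assuming a hypothetical @file{image.fits}, you can then simply
+run:
+
+@example
+$ astcrop image.fits --polygon=ds9.reg
+@end example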
-@item sqrt
-The square root of the first operand, so ``@command{5 sqrt}'' is equivalent to
@mymath{\sqrt{5}}.
-Since the square root is only defined for positive values, any negative-valued
pixel will become NaN (blank).
-The output will have a floating point type, but its precision is determined
from the input: if the input is a 64-bit floating point, the output will also
be 64-bit.
-Otherwise, the output will be 32-bit floating point (see @ref{Numeric data
types} for the respective precision).
-Therefore if you require 64-bit precision in estimating the square root,
convert the input to 64-bit floating point first, for example, with @code{5
float64 sqrt}.
-For example, each pixel of the output of the command below will be the square
root of that pixel in the input.
-@example
-$ astarithmetic image.fits sqrt
-@end example
+For the expected format of the region file, see the description of
@code{gal_ds9_reg_read_polygon} in @ref{SAO DS9 library}.
+However, since SAO DS9 makes this file for you, you do not usually need to worry about its internal format unless something unexpected happens and you find a bug.
-If you just want to scale an image with negative values using this operator
(for better visual inspection, and the actual values do not matter for you),
you can subtract the image from its minimum value, then take its square root:
+@item --polygonout
+Keep all the regions outside the polygon and mask the inner ones with blank
pixels (see @ref{Blank pixels}).
+This is practically the inverse of the default mode of treating polygons.
+Note that this option only works when you have only provided one input image.
+If multiple images are given (in WCS mode), then the full area covered by all
the images has to be shown and the polygon excluded.
+This can lead to a very large area if large surveys like COSMOS are used.
+So Crop will abort and notify you.
+In such cases, it is best to crop out the larger region you want, then mask
the smaller region with this option.
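+
+For example, the sketch below (assuming a single hypothetical
+@file{image.fits} that covers the region) will keep everything outside the
+XDF polygon of the previous example and blank everything inside it:
+
+@example
+$ astcrop image.fits --mode=wcs --polygonout \
+          --polygon="53.187414,-27.779152 : 53.159507,-27.759633 : \
+                     53.134517,-27.787144 : 53.161906,-27.807208"
+@end example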
-@example
-$ astarithmetic image.fits image.fits minvalue - sqrt -g1
-@end example
+@item --polygonsort
+Sort the given set of vertices to the @option{--polygon} option.
+For a convex polygon it will sort the vertices correctly; however, for a concave polygon there is no unique sorting, so be careful because the crop may not be what you expected.
-Alternatively, to avoid reading the image into memory two times, you can use
the @option{set-} operator to read it into the variable @option{i} and use
@option{i} two times to speed up the operation (described below):
+@cindex Convex polygons
+@cindex Concave polygons
+@cindex Polygons, Convex
+@cindex Polygons, Concave
+Polygons come in two classes: convex and concave (or generally, non-convex!),
see below for a demonstration.
+Convex polygons are those where all inner angles are less than 180 degrees.
+By contrast, a concave polygon is one where an inner angle may be more than
180 degrees.
@example
-$ astarithmetic image.fits set-i i i minvalue - sqrt
-@end example
+   Concave Polygon           Convex Polygon
+
-@item log
-Natural logarithm of first operand, so ``@command{4 log}'' is equivalent to
@mymath{ln(4)}.
-Negative pixels will become NaN, and the output type is determined from the
input, see the explanation under @command{sqrt} for more on these features.
-For example, the command below will take the natural logarithm of every pixel
in the input.
-@example
-$ astarithmetic image.fits log --output=log.fits
+     D --------C             D------------- C
+      \        |           E /              |
+       \E      |             \              |
+       /       |              \             |
+      A--------B                A ----------B
@end example
-@item log10
-Base-10 logarithm of first popped operand, so ``@command{4 log10}'' is equivalent to @mymath{log_{10}(4)}.
-Negative pixels will become NaN, and the output type is determined from the
input, see the explanation under @command{sqrt} for more on these features.
-For example, the command below will take the base-10 logarithm of every pixel
in the input.
-@example
-$ astarithmetic image.fits log10
-@end example
-@end table
-
-@node Trigonometric and hyperbolic operators, Constants, Basic mathematical
operators, Arithmetic operators
-@subsubsection Trigonometric and hyperbolic operators
-
-All the trigonometric and hyperbolic functions are described here.
-One good thing with these operators is that they take inputs and outputs in
degrees (which we usually need as input or output), not radians (like most
other programs/libraries).
+@item -s STR
+@itemx --section=STR
+Section of the input image which you want to be cropped.
+See @ref{Crop section syntax} for a complete explanation on the syntax
required for this input.
-@table @command
+@item -C FITS/TXT
+@itemx --catalog=FITS/TXT
+File name of catalog for making multiple crops from the input images/cubes.
+The catalog can be in any of Gnuastro's recognized @ref{Recognized table
formats}.
+The columns containing the coordinates for the crop centers can be specified
with the @option{--coordcol} option (using column names or numbers, see
@ref{Selecting table columns}).
+The catalog can also contain the name of each crop, you can specify the column
containing the name with the @option{--namecol}.
-@item sin
-@itemx cos
-@itemx tan
-@cindex Trigonometry
-Basic trigonometric functions.
-They take one operand, in units of degrees.
+@item --cathdu=STR/INT
+The HDU (extension) containing the catalog (if the file given to
@option{--catalog} is a FITS file).
+This can either be the HDU name (if it has one) or number (counting from 0).
+By default (if this option is not given), the second HDU will be used (equivalent to @option{--cathdu=1}).
+For more on how to specify the HDU, see the explanation of the @option{--hdu}
option in @ref{Input output options}.
-@item asin
-@itemx acos
-@itemx atan
-Inverse trigonometric functions.
-They take one operand and the returned values are in units of degrees.
+@item -x STR/INT
+@itemx --coordcol=STR/INT
+The column in a catalog to read as a coordinate.
+The value can be either the column number (starting from 1), or a match/search
in the table meta-data, see @ref{Selecting table columns}.
+This option must be called multiple times, depending on the number of
dimensions in the input dataset.
+If it is called more than necessary, the extra columns (later calls to this
option on the command-line or configuration files) will be ignored, see
@ref{Configuration file precedence}.
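+
+For example, if the crop centers are in catalog columns called @code{RA} and
+@code{DEC} (hypothetical names, they may differ in your catalog), a sketch
+would be:
+
+@example
+$ astcrop image.fits --mode=wcs --catalog=cat.fits \
+          --coordcol=RA --coordcol=DEC --width=10/3600
+@end example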
-@item atan2
-Inverse tangent (output in units of degrees) that uses the signs of the input
coordinates to distinguish between the quadrants.
-This operator therefore needs two operands: the first popped operand is
assumed to be the X axis position of the point, and the second popped operand
is its Y axis coordinate.
+@item -n STR/INT
+@itemx --namecol=STR/INT
+Column selection of crop file name.
+The value can be either the column number (starting from 1), or a match/search
in the table meta-data, see @ref{Selecting table columns}.
+This option can be used both in Image and WCS modes, and is not mandatory.
+When a column is given to this option, the final crop base file name will be
taken from the contents of this column.
+The directory will be determined by the @option{--output} option (current
directory if not given) and the value to @option{--suffix} will be appended.
+When this column is not given, the row number will be used instead.
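+
+For example, with a hypothetical name column called @code{OBJNAME}:
+
+@example
+$ astcrop image.fits --mode=wcs --catalog=cat.fits \
+          --namecol=OBJNAME --width=10/3600
+@end example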
-For example, see the commands below.
-To be more clear, we are using Table's @ref{Column arithmetic} which uses
exactly the same internal library function as the Arithmetic program for images.
-We are showing the results for four points in the four quadrants of the 2D
space (if you want to try running them, you do not need to type/copy the parts
after @key{#}).
-The first point (2,2) is in the first quadrant, therefore the returned angle
is 45 degrees.
-But the second, third and fourth points are in the quadrants of the same
order, and the returned angles reflect the quadrant.
+@end table
-@example
-$ echo " 2 2" | asttable -c'arith $2 $1 atan2' # --> 45
-$ echo " 2 -2" | asttable -c'arith $2 $1 atan2' # --> -45
-$ echo "-2 -2" | asttable -c'arith $2 $1 atan2' # --> -135
-$ echo "-2 2" | asttable -c'arith $2 $1 atan2' # --> 135
-@end example
-However, if you simply use the classic arc-tangent operator (@code{atan}) for
the same points, the result will only be in two quadrants as you see below:
-@example
-$ echo " 2 2" | asttable -c'arith $2 $1 / atan' # --> 45
-$ echo " 2 -2" | asttable -c'arith $2 $1 / atan' # --> -45
-$ echo "-2 -2" | asttable -c'arith $2 $1 / atan' # --> 45
-$ echo "-2 2" | asttable -c'arith $2 $1 / atan' # --> -45
-@end example
-@item sinh
-@itemx cosh
-@itemx tanh
-@cindex Hyperbolic functions
-Hyperbolic sine, cosine, and tangent.
-These operators take a single operand.
+@node Crop output, Crop known issues, Crop options, Invoking astcrop
+@subsubsection Crop output
-@item asinh
-@itemx acosh
-@itemx atanh
-Inverse Hyperbolic sine, cosine, and tangent.
-These operators take a single operand.
-@end table
+The string given to the @option{--output} option will be interpreted depending on how many crops were requested, see @ref{Crop modes}:
-@node Constants, Unit conversion operators, Trigonometric and hyperbolic
operators, Arithmetic operators
-@subsubsection Constants
-@cindex Pi
-During your analysis it is often necessary to have certain constants like the
number @mymath{\pi}.
-The ``operators'' in this section do not actually take any operand, they just
replace the desired constant into the stack.
-So in effect, these are actually operands.
-But since their value is not inserted by the user, we have placed them in the
list of operators.
+@itemize
+@item
+When a catalog is given, the value of the @option{--output} (see @ref{Common
options}) will be read as the directory to store the output cropped images.
+Hence if it does not already exist, Crop will abort with a ``No such file or directory'' error.
-@table @code
-@item e
-@cindex e (base of natural logarithm)
-@cindex Euler's number (@mymath{e})
-@cindex Base of natural logarithm (@mymath{e})
-Euler’s number, or the base of the natural logarithm (no units).
-See @url{https://en.wikipedia.org/wiki/E_(mathematical_constant), Wikipedia}.
+The crop file names will consist of two parts: a variable part (the row number
of each target starting from 1) along with a fixed string which you can set
with the @option{--suffix} option.
+Optionally, you may also use the @option{--namecol} option to define a column
in the input catalog to use as the file name instead of numbers.
-@item pi
-@cindex Pi
-Ratio of circle’s circumference to its diameter (no units).
-See @url{https://en.wikipedia.org/wiki/Pi, Wikipedia}.
+@item
+When only one crop is desired, the value to @option{--output} will be read as
a file name.
+If no output is specified or if it is a directory, the output file name will
follow the automatic output names of Gnuastro, see @ref{Automatic output}: The
string given to @option{--suffix} will be replaced with the @file{.fits} suffix
of the input.
+@end itemize
-@item c
-@cindex Speed of light
-The speed of light in vacuum, in units of @mymath{m/s}.
-see @url{https://en.wikipedia.org/wiki/Speed_of_light, Wikipedia}.
+By default, as suggested by the FITS standard and implemented in all Gnuastro
programs, the first/primary extension of the output files will only contain
metadata.
+The cropped images/cubes will be written into the 2nd HDU of their respective
FITS file (which is actually counted as @code{1} because HDU counting starts
from @code{0}).
+However, if you want the cropped data to be written into the primary (0-th)
HDU, run Crop with the @option{--primaryimghdu} option.
-@item G
-@cindex @mymath{g} (gravitational constant)
-@cindex Gravitational constant (@mymath{g})
-The gravitational constant, in units of @mymath{m^3/kg/s^2}.
-See @url{https://en.wikipedia.org/wiki/Gravitational_constant, Wikipedia}.
+If the output file already exists, by default Crop will re-write it (so that all existing HDUs in it will be deleted).
+If you want the cropped HDU to be appended to existing HDUs, use
@option{--append} described below.
-@item h
-@cindex @mymath{h} (Plank's constant)
-@cindex Plank's constant (@mymath{h})
-Plank's constant, in units of @mymath{J/Hz} or @mymath{kg\times m^2/s}.
-See @url{https://en.wikipedia.org/wiki/Planck_constant, Wikipedia}.
+The header of each output cropped image will contain the names of the input
image(s) it was cut from.
+If a name is longer than the 70 character space that the FITS standard allows
for header keyword values, the name will be cut into several keywords from the
nearest slash (@key{/}).
+The keywords have the following format: @command{ICFn_m} (for Crop File), where @command{n} is the number of the image used in this crop and @command{m} is the part of the name (it can be broken into multiple keywords).
+Following the name is another keyword named @command{ICFnPIX} which shows the
pixel range from that input image in the same syntax as @ref{Crop section
syntax}.
+So this string can be directly given to the @option{--section} option later.
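+
+For example, assuming the crop is in the first extension of a hypothetical
+@file{crop.fits}, you can read the pixel range of the first input image
+with:
+
+@example
+$ astfits crop.fits -h1 --keyvalue=ICF1PIX -q
+@end example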
-@item au
-@cindex Astronomical Unit (AU)
-@cindex AU (Astronomical Unit)
-Astronomical Unit, in units of meters.
-See @url{https://en.wikipedia.org/wiki/Astronomical_unit, Wikipedia}.
+Once done, a log file can be created in the current directory with the
@code{--log} option.
+This file will have three columns and the same number of rows as the number of
cropped images.
+There are also comments on the top of the log file explaining basic
information about the run and descriptions for the columns.
+A short description of the columns is also given below:
-@item ly
-@cindex Light year
-Distance covered by light in vacuum in one year, in units of meters.
-See @url{https://en.wikipedia.org/wiki/Light-year, Wikipedia}.
+@enumerate
+@item
+The cropped image file name for that row.
+@item
+The number of input images that were used to create that image.
+@item
+A @code{0} if the central few pixels (value to the @option{--checkcenter}
option) are blank and @code{1} if they are not.
+When the crop was not defined by its center (see @ref{Crop modes}), or
@option{--checkcenter} was given a value of 0 (see @ref{Invoking astcrop}), the
center will not be checked and this column will be given a value of @code{-1}.
+@end enumerate
-@item avogadro
-@cindex Avogradro's number
-Avogadro's constant, in units of @mymath{1/mol}.
-See @url{https://en.wikipedia.org/wiki/Avogadro_constant, Wikipedia}.
+If the output crop(s) have a single element (pixel in an image) and
@option{--oneelemstdout} has been called, no output file will be produced!
+Instead, the single element's value is printed on the standard output.
+See the description of @option{--oneelemstdout} below for more:
-@item fine-structure
-@cindex Fine structure constant
-The fine-structure constant (no units).
-See @url{https://en.wikipedia.org/wiki/Fine-structure_constant, Wikipedia}.
-@end table
+@table @option
+@item -p STR
+@itemx --suffix=STR
+The suffix (or post-fix) of the output files for when you want all the cropped
images to have a special ending.
+One case where this might be helpful is when besides the science images, you
want the weight images (or exposure maps, which are also distributed with
survey images) of the cropped regions too.
+So in one run, you can set the input images to the science images and
@option{--suffix=_s.fits}.
+In the next run you can set the weight images as input and
@option{--suffix=_w.fits}.
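+
+A sketch of those two runs (with hypothetical file names) could be:
+
+@example
+$ astcrop --mode=wcs --catalog=cat.fits sci-*.fits --suffix=_s.fits
+$ astcrop --mode=wcs --catalog=cat.fits wht-*.fits --suffix=_w.fits
+@end example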
-@node Unit conversion operators, Statistical operators, Constants, Arithmetic
operators
-@subsubsection Unit conversion operators
+@item -a STR
+@itemx --metaname=STR
+Name of cropped HDU (value to the @code{EXTNAME} keyword of FITS).
+If not given, a default @code{CROP} will be placed there (so the
@code{EXTNAME} keyword will always be present in the output).
+If crop produces many outputs from a catalog, they will be given the same
string as @code{EXTNAME} (the file names containing the cropped HDU will be
different).
-It often happens that you have data in one unit (for example, counts on your
CCD), but would like to convert it into another (for example, magnitudes, to
measure the brightness of a galaxy).
-While the equations for the unit conversions can be easily found on the
internet, the operators in this section are designed to simplify the process
and let you do it easily and fast without having to remember constants and
relations.
+@item -A
+@itemx --append
+If the output file already exists, append the cropped image HDU to the end of
any existing HDUs.
+By default (when this option isn't given), if an output file already exists,
any existing HDU in it will be deleted.
+If the output file doesn't exist, this option is redundant.
-@table @command
+@item --primaryimghdu
+Write the output into the primary (0-th) HDU/extension of the output.
+By default, like all Gnuastro's default outputs, no data is written in the
primary extension because the FITS standard suggests keeping that extension
free of data and only for metadata.
-@item counts-to-mag
-Convert counts (usually CCD outputs) to magnitudes using the given zero point.
-The zero point is the first popped operand and the count image or value is the
second popped operand.
+@item -t
+@itemx --oneelemstdout
+When a crop only has a single element (a single pixel), print it to the
standard output instead of making a file.
+By default (without this option), a single-pixel crop will be saved to a file,
just like a crop of any other size.
-For example, assume you have measured the standard deviation of the noise in
an image to be @code{0.1} counts, and the image's zero point is @code{22.5} and
you want to measure the @emph{per-pixel} surface brightness limit of the
dataset@footnote{The @emph{per-pixel} surface brightness limit is the magnitude
of the noise standard deviation. For more on surface brightness see
@ref{Brightness flux magnitude}.
-In the example command, because the output is a single number, we are using
@option{--quiet} to avoid printing extra information.}.
-To apply this operator on an image, simply replace @code{0.1} with the image
name, as described below.
+When a single crop is requested (either through @option{--center}, or a
catalog of one row is given), the single value alone is printed with nothing
else.
+This makes it easy to immediately write the value into a shell variable, for example:
@example
-$ astarithmetic 0.1 22.5 counts-to-mag --quiet
+value=$(astcrop img.fits --mode=wcs --center=1.234,5.678 \
+ --width=1 --widthinpix --oneelemstdout \
+ --quiet)
@end example
-Of course, you can also convert every pixel in an image (or table column in
Table's @ref{Column arithmetic}) with this operator if you replace the second
popped operand with an image/column name.
-For an example of applying this operator on an image, see the description of
surface brightness in @ref{Brightness flux magnitude}, where we will convert an
image's pixel values to surface brightness.
+If a catalog of coordinates is given (that would produce multiple crops; or
multiple values in this scenario), the solution for a single value will not
work!
+Recall that Crop will do the crops in parallel, therefore each time you run
it, the order of the rows will be different and not correspond to the order of
the inputs.
-@item mag-to-counts
-Convert magnitudes to counts (usually CCD outputs) using the given zero point.
-The zero point is the first popped operand and the magnitude value is the
second.
-For example, if an object has a magnitude of 20, you can estimate the counts
corresponding to it (when the image has a zero point of 24.8) with this command:
-Note that because the output is a single number, we are using @option{--quiet}
to avoid printing extra information.
+To allow identification of each value (which row of the input catalog it corresponds to), Crop will first print the name of the would-be created file, and print the value after it (separated by an empty SPACE character).
+In other words, the file in the first column will not actually be created, but
the value of the pixel it would have contained (if this option was not called)
is printed after it.
-@example
-$ astarithmetic 20 24.8 mag-to-counts --quiet
-@end example
+@item -c FLT/INT
+@itemx --checkcenter=FLT/INT
+@cindex Check center of crop
+Square box width of region in the center of the image to check for blank
values.
+If any of the pixels in this central region of a crop (defined by its center)
are blank, then it will not be stored in an output file.
+If the value to this option is zero, no checking is done.
+This check is only applied when the cropped region(s) are defined by their
center (not by the vertices, see @ref{Crop modes}).
-@item counts-to-sb
-Convert counts to surface brightness using the zero point and area (in units
of arcsec@mymath{^2}).
-The first popped operand is the area (in arcsec@mymath{^2}), the second popped
operand is the zero point and the third are the count values.
-Estimating the surface brightness involves taking the logarithm.
-Therefore this operator will produce NaN for counts with a negative value.
+The units of the value are interpreted based on the @option{--mode} value (in
WCS or pixel units).
+The ultimate checked region size (in pixels) will be an odd integer around the
center (converted from WCS, or when an even number of pixels are given to this
option).
+In WCS mode, the value can be given as fractions, for example, if the WCS
units are in degrees, @code{0.1/3600} will correspond to a check size of 0.1
arcseconds.
-For example, with the commands below, we read the zero point from the image
headers (assuming it is in the @code{ZPOINT} keyword), we calculate the pixel
area from the image itself, and we call this operator to convert the image
pixels (in counts) to surface brightness (mag/arcsec@mymath{^2}).
+Because survey regions do not often have a clean square or rectangle shape,
some of the pixels on the sides of the survey FITS image do not commonly have
any data and are blank (see @ref{Blank pixels}).
+So when the catalog was not generated from the input image, it often happens
that the image does not have data over some of the points.
-@example
-$ zeropoint=$(astfits image.fits --keyvalue=ZPOINT -q)
-$ pixarea=$(astfits image.fits --pixelareaarcsec2)
-$ astarithmetic image.fits $zeropoint $pixarea counts-to-sb \
- --output=image-sb.fits
-
-@end example
-For more on the definition of surface brightness see @ref{Brightness flux
magnitude}, and for a fully tutorial on optimal usage of this, see @ref{FITS
images in a publication}.
+When the given center of a crop falls in such regions or outside the dataset,
and this option has a non-zero value, no crop will be created.
+Therefore with this option, you can specify a width of a small box (3 pixels
is often good enough) around the central pixel of the cropped image.
+You can check which crops were created and which were not from the
command-line (if @option{--quiet} was not called, see @ref{Operating mode
options}), or in Crop's log file (see @ref{Crop output}).
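+
+For example, the sketch below (with hypothetical file names) will not
+produce a crop when the central box of roughly one arcsecond contains blank
+pixels:
+
+@example
+$ astcrop image.fits --mode=wcs --catalog=cat.fits \
+          --width=10/3600 --checkcenter=1/3600
+@end example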
-@item sb-to-counts
-Convert surface brightness using the zero point and area (in units of
arcsec@mymath{^2}) to counts.
-The first popped operand is the area (in arcsec@mymath{^2}), the second popped
operand is the zero point and the third are the surface brightness values.
-See the description of @command{counts-to-sb} for more.
+@item -b
+@itemx --noblank
+Pixels outside of the input image that are in the crop box will not be used.
+By default they are filled with blank values (depending on type), see
@ref{Blank pixels}.
+This option only applies in Image mode, see @ref{Crop modes}.
-@item mag-to-sb
-Convert magnitudes to surface brightness over a certain area (in units of
arcsec@mymath{^2}).
-The first popped operand is the area and the second is the magnitude.
-For example, let's assume you have a table with the two columns of magnitude
(called @code{MAG}) and area (called @code{AREAARCSEC2}).
-In the command below, we will use @ref{Column arithmetic} to return the
surface brightness.
-@example
-$ asttable table.fits -c'arith MAG AREAARCSEC2 mag-to-sb'
-@end example
+@item -z
+@itemx --zeroisnotblank
+In float or double images, it is common to give the value of zero to blank
pixels.
+If the input image type is one of these two types, such pixels will also be
considered as blank.
+You can disable this behavior with this option, see @ref{Blank pixels}.
+@end table
-@item sb-to-mag
-Convert surface brightness to magnitudes over a certain area (in units of
arcsec@mymath{^2}).
-The first popped operand is the area and the second is the magnitude.
-See the description of @code{mag-to-sb} for more.
-@item counts-to-jy
-@cindex AB magnitude
-@cindex Magnitude, AB
-Convert counts (usually CCD outputs) to Janskys through an AB-magnitude based
zero point.
-The top-popped operand is assumed to be the AB-magnitude zero point and the
second-popped operand is assumed to be a dataset in units of counts (an image
in Arithmetic, and a column in Table's @ref{Column arithmetic}).
-For the full equation and basic definitions, see @ref{Brightness flux
magnitude}.
+@node Crop known issues, , Crop output, Invoking astcrop
+@subsubsection Crop known issues
-@cindex SDSS
-@cindex Nanomaggy
-For example, SDSS images are calibrated in units of nanomaggies, with a fixed
zero point magnitude of 22.5.
-Therefore you can convert the units of SDSS image pixels to Janskys with the
command below:
+When running Crop, you may encounter strange errors and bugs.
+In these cases, please report a bug and we will try to fix it as soon as
possible, see @ref{Report a bug}.
+However, some things are beyond our control, or may take too long to fix
directly.
+In this section we list such known issues and suggest the hack (or work-around) to fix the problem:
-@example
-$ astarithmetic sdss-image.fits 22.5 counts-to-jy
-@end example
+@table @asis
+@item Crash with @samp{Killed} when cropping catalog from @file{.fits.gz}
+This happens because CFITSIO (the library that reads and writes FITS files) will
internally decompress the file in a temporary place (possibly in the RAM), then
start reading from it.
+On the other hand, by default when given a catalog (with many crops) and not
specifying @option{--numthreads}, Crop will use the maximum number of threads
available on your system to do each crop faster.
+On a normal (not compressed) file, parallel access will not cause a problem,
however, when attempting parallel access with the maximum number of threads on
a compressed file, CFITSIO crashes with @code{Killed}.
+Therefore the following solutions can be used to fix this crash:
-@item jy-to-counts
-Convert Janskys to counts (usually CCD outputs) through an AB-magnitude based
zero point.
-This is the inverse operation of the @code{counts-to-jy}, see there for usage
example.
+@itemize
+@item
+Decrease the number of threads (at the minimum, set @option{--numthreads=1}).
+Since this solution does not change any other component of your previous Crop
command or your local file structure, it is the preferred way.
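+A sketch of this solution is below (the catalog name and its contents are
only assumptions for this demonstration):
+@example
+$ astcrop image.fits.gz --catalog=cat.fits --numthreads=1
+@end example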
-@item counts-to-nanomaggy
-@cindex Nanomaggy
-Convert counts to Nanomaggy (with fixed zero point of 22.5, used as the pixel
units of many surveys like SDSS).
-For example if your image has a zero point of 24.93, you can convert it to
Nanomaggies with the command below:
+@item
+Decompress the file (with the command below) and feed the @file{.fits} file
into Crop without changing the number of threads.
@example
-$ astarithmetic image.fits 24.93 counts-to-nanomaggy
+$ gunzip -k image.fits.gz
@end example
+@end itemize
+@end table
-@item nanomaggy-to-counts
-@cindex Nanomaggy
-Convert Nanomaggy to counts.
-Nanomaggy is defined to have a fixed zero point of 22.5 and is the pixel units
of many surveys like SDSS.
-For example if you would like to convert an image in units of Nanomaggy (for
example from SDSS) to the counts of a camera with a zero point of 25.92, you
can use the command below:
-
-@example
-$ astarithmetic image.fits 25.92 nanomaggy-to-counts
-@end example
-@item mag-to-jy
-Convert AB magnitudes to Janskys, see @ref{Brightness flux magnitude}.
-@item jy-to-mag
-Convert Janskys to AB magnitude, see @ref{Brightness flux magnitude}.
-@item au-to-pc
-@cindex Parsecs
-@cindex Astronomical Units (AU)
-Convert Astronomical Units (AUs) to Parsecs (PCs).
-This operator takes a single argument which is interpreted to be the input AUs.
-The conversion is based on the definition of Parsecs: @mymath{1 \rm{PC} =
1/tan(1^{\prime\prime}) \rm{AU}}, where @mymath{1^{\prime\prime}} is one
arcsecond.
-In other words, @mymath{1 (\rm{PC}) = 648000/\pi (\rm{AU})}.
-For example, if we take Pluto's average distance to the Sun to be 40 AUs, we
can obtain its distance in Parsecs using this command:
-@example
-echo 40 | asttable -c'arith $1 au-to-pc'
-@end example
-@item pc-to-au
-Convert Parsecs (PCs) to Astronomical Units (AUs).
-This operator takes a single argument which is interpreted to be the input PCs.
-For more on the conversion equation, see description of @code{au-to-pc}.
-For example, Proxima Centauri (the nearest star to the Solar system) is 1.3020
Parsecs from the Sun; we can calculate this distance in units of AUs with the
command below:
-@example
-echo 1.3020 | asttable -c'arith $1 pc-to-au'
-@end example
-@item ly-to-pc
-@cindex Light-year
-Convert Light-years (LY) to Parsecs (PCs).
-This operator takes a single argument which is interpreted to be the input LYs.
-The conversion is done from IAU's definition of the light-year
(9460730472580800 m @mymath{\approx} 63241.077 AU = 0.306601 PC, for the
conversion of AU to PC, see the description of @code{au-to-pc}).
-For example, the distance of Andromeda galaxy to our galaxy is 2.5 million
light-years, so its distance in kilo-Parsecs can be calculated with the command
below (note that we want the output in kilo-parsecs, so we are dividing the
output of this operator by 1000):
-@example
-echo 2.5e6 | asttable -c'arith $1 ly-to-pc 1000 /'
-@end example
-@item pc-to-ly
-Convert Parsecs (PCs) to Light-years (LY).
-This operator takes a single argument which is interpreted to be the input PCs.
-For the conversion and an example of the inverse of this operator, see the
description of @code{ly-to-pc}.
-@item ly-to-au
-Convert Light-years (LY) to Astronomical Units (AUs).
-This operator takes a single argument which is interpreted to be the input LYs.
-For the conversion and a similar example, see the description of
@code{ly-to-pc}.
-@item au-to-ly
-Convert Astronomical Units (AUs) to Light-years (LY).
-This operator takes a single argument which is interpreted to be the input AUs.
-For the conversion and a similar example, see the description of
@code{ly-to-pc}.
-@end table
-@node Statistical operators, Stacking operators, Unit conversion operators,
Arithmetic operators
-@subsubsection Statistical operators
+@node Arithmetic, Convolve, Crop, Data manipulation
+@section Arithmetic
-The operators in this section take a single dataset as input, and will return
the desired statistic as a single value.
+It is commonly necessary to do operations on some or all of the elements of a
dataset independently (for example, the pixels of an image).
+For example, in the reduction of raw data it is necessary to subtract the Sky
value (@ref{Sky value}) from each image.
+Later (once the images are warped into a single grid using Warp for example,
see @ref{Warp}), the images are co-added (the output pixel grid is the average
of the pixels of the individual input images).
+Arithmetic is Gnuastro's program for such operations on your datasets directly
from the command-line.
+It currently uses the reverse polish (or post-fix) notation, see @ref{Reverse
polish notation}, and works on the native data types of the input
images/data to reduce CPU and RAM usage, see @ref{Numeric data types}.
+For more information on how to run Arithmetic, please see @ref{Invoking
astarithmetic}.
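+For example, the command below is a minimal sketch of the Sky subtraction
step mentioned above (assuming a Sky image @file{sky.fits} that was estimated
earlier, for example with NoiseChisel or Statistics):
+@example
+$ astarithmetic image.fits sky.fits - -g1 --output=skysub.fits
+@end example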
-@table @command
-@item minvalue
-Minimum value in the first popped operand, so ``@command{a.fits minvalue}''
will push the minimum pixel value in this image onto the stack.
-When this operator acts on a single image, the output (operand that is put
back on the stack) will no longer be an image, but a number.
-The output of this operand is in the same type as the input.
-This operator is mainly intended for multi-element datasets (for example,
images or data cubes); if the popped operand is a number, it will just return
it without any change.
+@menu
+* Reverse polish notation:: The current notation style for Arithmetic
+* Integer benefits and pitfalls:: Integers have major benefits, but require
care
+* Arithmetic operators:: List of operators known to Arithmetic
+* Invoking astarithmetic:: How to run Arithmetic: options and output
+@end menu
-Note that when the final remaining/output operand is a single number, it is
printed onto the standard output.
-For example, with the command below the minimum pixel value in
@file{image.fits} will be printed in the terminal:
-@example
-$ astarithmetic image.fits minvalue
-@end example
+@node Reverse polish notation, Integer benefits and pitfalls, Arithmetic,
Arithmetic
+@subsection Reverse polish notation
-However, the output above also includes a lot of extra information that is
not relevant in this context.
-If you just want the final number, run Arithmetic in quiet mode:
-@example
-$ astarithmetic image.fits minvalue -q
-@end example
+@cindex Post-fix notation
+@cindex Reverse Polish Notation
+The most common notation for arithmetic operations is the
@url{https://en.wikipedia.org/wiki/Infix_notation, infix notation} where the
operator goes between the two operands, for example, @mymath{4+5}.
+The infix notation is the preferred way in most programming languages which
come with scripting features for large programs.
+This is because the infix notation requires a way to define precedence when
more than one operator is involved.
-Also see the description of @option{sqrt} for other example usages of this
operator.
+For example, consider the statement @code{5 + 6 / 2}.
+Should 6 first be divided by 2, and the result added to 5?
+Or should 5 first be added to 6, and the result divided by 2?
+Therefore we need parentheses to show precedence: @code{5+(6/2)} or
@code{(5+6)/2}.
+Furthermore, if you need to leave a value for later processing, you will need
to define a variable for it; for example, @code{a=(5+6)/2}.
-@item maxvalue
-Maximum value of first operand in the same type, similar to
@command{minvalue}, see the description there for more.
-For example
-@example
-$ astarithmetic image.fits maxvalue -q
-@end example
+Gnuastro provides libraries where you can also use infix notation in C or C++
programs.
+However, Gnuastro's programs are primarily designed to be run on the
command-line, and the level of complexity that infix notation requires can be
annoying/confusing to write there (where parentheses and variables can
conflict with the shell's own syntax).
+Therefore Gnuastro's Arithmetic and Table (when doing column arithmetic)
programs use the post-fix notation, also known as
@url{https://en.wikipedia.org/wiki/Reverse_Polish_notation, reverse polish
notation}.
+For example, instead of writing @command{5+6}, we write @command{5 6 +}.
-@item numbervalue
-Number of non-blank elements in first operand in the @code{uint64} type (since
it is always a positive integer, see @ref{Numeric data types}).
-Its usage is similar to @command{minvalue}, for example
-@example
-$ astarithmetic image.fits numbervalue -q
-@end example
+The Wikipedia article on the reverse polish notation provides an excellent
explanation of this notation, but we will give a short summary here for
self-sufficiency.
+In short, in the reverse polish notation, the operator is placed after the
operands.
+As we will see below, this removes the need for parentheses and lets you
use previous values without needing to define a variable.
+In the future@footnote{@url{https://savannah.gnu.org/task/index.php?13867}} we
do plan to also optionally allow infix notation when arithmetic operations on
datasets are desired, but due to time constraints on the developers we cannot
do it immediately.
-@item sumvalue
-Sum of non-blank elements in first operand in the @code{float32} type.
-Its usage is similar to @command{minvalue}, for example
-@example
-$ astarithmetic image.fits sumvalue -q
-@end example
+To easily understand how the reverse polish notation works, you can think of
each operand (@code{5} and @code{6} in the example above) as a node in a
``last-in-first-out'' stack.
+One such stack in daily life is a stack of dishes in the kitchen: you put a
clean dish on the top of the stack when it is ready for later usage.
+Later, when you need a dish, you pick the top one (hence the ``last'' dish
placed ``in'' the stack is the ``first'' dish that comes ``out'' when
necessary).
-@item meanvalue
-Mean value of non-blank elements in first operand in the @code{float32} type.
-Its usage is similar to @command{minvalue}, for example
-@example
-$ astarithmetic image.fits meanvalue -q
-@end example
+Each operator will need a certain number of operands (in the example above,
the @code{+} operator needs two operands: @code{5} and @code{6}).
+In the kitchen metaphor, an operator can be an oven.
+Every time an operator is confronted, the operator takes (or ``pops'') the
number of operands it needs from the top of the stack (so they do not exist in
the stack any more), does its operation, and places (or ``pushes'') the result
back on top of the stack.
+So if you want the average of 5 and 6, you would write: @command{5 6 + 2 /}.
+The operations that are done are:
-@item stdvalue
-Standard deviation of non-blank elements in first operand in the
@code{float32} type.
-Its usage is similar to @command{minvalue}, for example
-@example
-$ astarithmetic image.fits stdvalue -q
-@end example
+@enumerate
+@item
+@command{5} is an operand, so Arithmetic pushes it to the top of the stack
(which is initially empty).
+In the kitchen metaphor, you can visualize this as taking a new dish from the
cabinet, putting the number 5 inside of the dish, and putting the dish on top
of the (empty) cooking table in front of you.
+You now have a stack of one dish on the table in front of you.
+@item
+@command{6} is also an operand, so it is pushed to the top of the stack.
+Like before, you can visualize this as taking a new dish from the cabinet,
putting the number 6 in it and placing it on top of the previous dish.
+You now have a stack of two dishes on the table in front of you.
+@item
+@command{+} is a @emph{binary} operator, so it will pop the top two elements
of the stack out of it, and perform addition on them (the order is @mymath{5+6}
in the example above).
+The result is @command{11} which is pushed to the top of the stack.
-@item medianvalue
-Median of non-blank elements in first operand with the same type.
-Its usage is similar to @command{minvalue}, for example
-@example
-$ astarithmetic image.fits medianvalue -q
-@end example
+To visualize this, you can think of the @code{+} operator as an oven with a
place for two dishes.
+You pick up the top-most dish (that has the number 6 in it) and put it in the
oven.
+The top dish is now the one that has the number 5.
+You also pick it up and put it in the oven, and close the oven door.
+When the oven has finished its cooking, it produces a single output (in one
dish, with the number 11 inside of it).
+You take that output dish and put it back on the table.
+You now have a stack of one dish on the table in front of you.
+@item
+@command{2} is an operand, so it is pushed onto the top of the stack.
+In the kitchen metaphor, you again go to the cabinet, pick up a dish and put
the number 2 inside of it and put the dish over the previous dish (that has the
number 11).
+You now have a stack of two dishes on the table in front of you.
+@item
+@command{/} (division) is a binary operator, so pull out the top two elements
of the stack (top-most is @command{2}, then @command{11}) and divide the second
one by the first.
+In the kitchen metaphor, the @command{/} operator can be visualized as a
microwave that takes two dishes.
+But unlike the oven (@code{+} operator) before, the order of inputs matters
(they are on top of each other: with the top dish holder being the numerator
and the bottom one being the denominator).
+Again, you look to your stack of dishes on the table.
-@item unique
-Remove all duplicate (and blank) elements from the first popped operand.
-The unique elements of the dataset will be stored in a single-dimensional
dataset.
-
-Recall that by default, single-dimensional datasets are stored as a table
column in the output.
-But you can use @option{--onedasimage} or @option{--onedonstdout} to
respectively store them as a single-dimensional FITS array/image, or to print
them on the standard output.
-
-Although you can use this operator on a floating point dataset, due to
floating-point errors it may give unreasonable values: the tenth digit after
the decimal point is also considered, although it may be statistically
meaningless, see @ref{Numeric data types}.
-It is therefore better/recommended to use it on integer datasets like the
labeled images of @ref{Segment output} where each pixel has the integer label
of the object/clump it is associated with.
-For example, let's assume you have cropped a region of a larger labeled image
and want to find the labels/objects that are within the crop.
-With this operator, this job is trivial:
-@example
-$ astarithmetic seg-crop.fits unique
-@end example
-
-@item noblank
-Remove all blank elements from the first popped operand.
-Since the blank pixels are being removed, the output dataset will always be
single-dimensional, independent of the dimensionality of the input.
-
-Recall that by default, single-dimensional datasets are stored as a table
column in the output.
-But you can use @option{--onedasimage} or @option{--onedonstdout} to
respectively store them as a single-dimensional FITS array/image, or to print
them on the standard output.
-
-For example, with the command below, the non-blank pixel values of
@file{cropped.fits} are printed on the command-line (the @option{--quiet}
option is used to remove the extra information that Arithmetic prints as it
reads the inputs, its version and its running time).
-
-@example
-$ astarithmetic cropped.fits noblank --onedonstdout --quiet
-@end example
+You pick up the top one (with value 2 inside of it) and put it in the
microwave's bottom (denominator) dish holder.
+Then you go back to your stack of dishes on the table and pick up the top dish
(with value 11 inside of it) and put that in the top (numerator) dish holder.
+The microwave will do its work and when it is finished, returns a new dish
with the single value 5.5 inside of it.
+You pick up the dish from the microwave and place it back on the table.
-@end table
+@item
+There are no more operands or operators, so simply return the remaining
operand in the output.
+In the kitchen metaphor, you see that your recipe has no more steps, so you
just pick up the remaining dish and take it to the dining room to enjoy a good
dinner.
+@end enumerate
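+You can try this directly on the command-line.
+The sketch below uses decimal points to force floating point types (see the
integer pitfalls in the next section) and @option{--quiet} to only print the
final value (5.5):
+@example
+$ astarithmetic 5. 6. + 2. / --quiet
+@end example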
-@node Stacking operators, Filtering operators, Statistical operators,
Arithmetic operators
-@subsubsection Stacking operators
+In the Arithmetic program, the operands can be FITS images of any
dimensionality, or numbers (see @ref{Invoking astarithmetic}).
+In Table's column arithmetic, they can be any column in the table (a series of
numbers in an array) or a single number (see @ref{Column arithmetic}).
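+For example, the sketch below (on a hypothetical @file{table.fits}) adds 100
to every element of the first column; the single number and the column are
combined just like the numbers above:
+@example
+$ asttable table.fits -c'arith $1 100 +'
+@end example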
-@cindex Stacking
-@cindex Coaddition
-The operators in this section are used when you have multiple datasets that
you would like to merge into one, commonly known as ``stacking'' or
``coaddition''.
-For example, you have taken ten exposures of your scientific target, and you
would like to combine them all into a single stacked image that is deeper.
+With this notation, very complicated procedures can be created without the
need for parentheses or worrying about precedence.
+Even functions which take an arbitrary number of arguments can be defined in
this notation.
+This is a very powerful notation and is used in languages like Postscript
@footnote{See the EPS and PDF part of @ref{Recognized file formats} for a
little more on the Postscript language.} which produces PDF files when compiled.
-When calling these operators you should determine how many operands they
should take in (unlike the rest of the operators that have a fixed number of
input operands).
-As described below, you do this through their first
popped operand (which should be a single integer number that is larger than
one).
-@table @command
-@cindex NaN
-@item min
-For each pixel, find the minimum value in all given datasets.
-The output will have the same type as the input.
-The first popped operand to this operator must be a positive integer number
which specifies how many further operands should be popped from the stack.
-All the subsequently popped operands must have the same type and size.
-This operator (and all the variable-operand operators similar to it that are
discussed below) will work in multi-threaded mode unless Arithmetic is called
with the @option{--numthreads=1} option, see @ref{Multi-threaded operations}.
+@node Integer benefits and pitfalls, Arithmetic operators, Reverse polish
notation, Arithmetic
+@subsection Integer benefits and pitfalls
-Each pixel of the output of the @code{min} operator will be given the minimum
value of the same pixel from all the popped operands/images.
-For example, the following command will produce an image with the same size
and type as the three inputs, but each output pixel value will be the minimum
of the same pixel's values in all three input images.
+Integers are the simplest numerical data types (@ref{Numeric data types}).
+Because of this, their storage space is much less, and their processing is
much faster than floating point types.
+You can confirm this on your computer with the series of commands below.
+You will make four 5000 by 5000 pixel images filled with random values.
+Two of them will be saved as signed 8-bit integers, and two with 64-bit
floating point types.
+The last command prints the size of the created images.
@example
-$ astarithmetic a.fits b.fits c.fits 3 min --output=min.fits
+$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8 -oint-1.fits
+$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma int8 -oint-2.fits
+$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-1.fits
+$ astarithmetic 5000 5000 2 makenew 5 mknoise-sigma float64 -oflt-2.fits
+$ ls -lh int-*.fits flt-*.fits
@end example
-Important notes:
-@itemize
-
-@item
-NaN/blank pixels will be ignored, see @ref{Blank pixels}.
-
-@item
-The output will have the same type as the inputs.
-This is natural for the @command{min} and @command{max} operators, but for
other similar operators (for example, @command{sum}, or @command{mean}) the
per-pixel operations will be done in double precision floating point and then
stored back in the input type.
-Therefore, if the input was an integer, C's internal type conversion will be
used.
-
-@item
-The operation will be multi-threaded, greatly speeding up the process if you
have large and numerous data to stack.
-You can disable multi-threaded operations with the @option{--numthreads=1}
option (see @ref{Multi-threaded operations}).
-@end itemize
+The 8-bit integer images are only 24 MB, while the 64-bit floating point images
are 191 MB!
+Besides helping in storage (on your disk, or in RAM, while the program is
running), the small size of these files also helps in faster reading of the
inputs.
+Furthermore, CPUs can process integer operations much faster than floating
points.
+Among the integers, the ones with a smaller width (number of bits) can be
processed much faster.
+You can see this with the two commands below, where you will add the integer
images with each other and the floats with each other:
-@item max
-For each pixel, find the maximum value in all given datasets.
-The output will have the same type as the input.
-This operator is called similar to the @command{min} operator, please see
there for more.
-For example
@example
-$ astarithmetic a.fits b.fits c.fits 3 max -omax.fits
+$ astarithmetic flt-1.fits flt-2.fits + -oflt-sum.fits -g1
+$ astarithmetic int-1.fits int-2.fits + -oint-sum.fits -g1
@end example
+Have a look at the running time of the two commands above (that is printed on
their last line).
+On the system that this paragraph was written on, the floating point and
integer image sums were respectively done in 0.481 and 0.089 seconds (the
integer operation was more than 5 times faster!).
-@item number
-For each pixel count the number of non-blank pixels in all given datasets.
-The output will be an unsigned 32-bit integer datatype (see @ref{Numeric data
types}).
-This operator is called similar to the @command{min} operator, please see
there for more.
-For example
-@example
-$ astarithmetic a.fits b.fits c.fits 3 number -onum.fits
-@end example
+@cartouche
+@noindent
+@strong{If your data does not have decimal points, use integer types:} integer
types are much faster and can take much less space in your storage or RAM
(while the program is running).
+@end cartouche
-Some datasets may have blank values (which are also ignored in all similar
operators like @command{min}, @command{sum}, @command{mean} or
@command{median}).
-Hence, the final pixel values of this operator will not, in general, be equal
to the number of inputs.
-This operator is therefore mostly called in parallel with those operators to
know the ``weight'' of each pixel (in case you want to only keep pixels that
had the full exposure for example).
+@cartouche
+@noindent
+@strong{Select the smallest width that can host the range/precision of
values:} for example, if the largest possible value in your dataset is 1000 and
all numbers are integers, store it as a 16-bit integer.
+Also, if you know the values can never become negative, store it as an
unsigned 16-bit integer.
+For floating point types, if you know you will not need a precision of more
than 6 significant digits, use the 32-bit floating point type.
+For more on the range (for integers) and precision (for floats), see
@ref{Numeric data types}.
+@end cartouche
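+For example, the sketch below converts a hypothetical 64-bit floating point
image whose values are known to be positive integers below 1000 into an
unsigned 16-bit integer image:
+@example
+$ astarithmetic image.fits uint16 --output=smaller.fits
+@end example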
-@item sum
-For each pixel, calculate the sum in all given datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{min} operator, please see
there for more.
-For example
-@example
-$ astarithmetic a.fits b.fits c.fits 3 sum -ostack-sum.fits
-@end example
+There is a price to be paid for this improved efficiency in integers: your
wisdom!
+If you have not selected your types wisely, strange situations may happen.
+For example, try the command below:
-@item mean
-For each pixel, calculate the mean in all given datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{min} operator, please see
there for more.
-For example
@example
-$ astarithmetic a.fits b.fits c.fits 3 mean -ocoadd-mean.fits
+$ astarithmetic 125 10 +
@end example
-@item std
-For each pixel, find the standard deviation in all given datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{min} operator, please see
there for more.
-For example
-@example
-$ astarithmetic a.fits b.fits c.fits 3 std -ostd.fits
-@end example
+@cindex Integer overflow
+@cindex Overflow, integer
+@noindent
+You expect the output to be @mymath{135}, but it will be @mymath{-121}!
+The reason is that when Arithmetic (or column-arithmetic in Table) confronts a
number on the command-line, it uses the principles above to select the most
efficient type for each number.
+Both @mymath{125} and @mymath{10} can safely fit within a signed, 8-bit
integer type, so Arithmetic will store both as an 8-bit integer.
+However, the sum (@mymath{135}) is larger than the maximum possible value of
an 8-bit signed integer (@mymath{127}).
+Therefore an integer overflow will occur, and the bits will be over-written.
+As a result, the value will be @mymath{135-128=7} more than the minimum value
of this type (@mymath{-128}), which is @mymath{-128+7=-121}.
+
+When you know situations like this may occur, you can simply use the
@ref{Numerical type conversion operators} to set just one of the inputs to a
wider data type (the smallest wider type, to avoid wasting resources).
+In the example above, this would be @code{uint16}:
-@item median
-For each pixel, find the median in all given datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{min} operator, please see
there for more.
-For example
@example
-$ astarithmetic a.fits b.fits c.fits 3 median \
- --output=stack-median.fits
+$ astarithmetic 125 uint16 10 +
@end example
-@item quantile
-For each pixel, find the quantile from all given datasets.
-The output will have the same numeric data type and size as the input datasets.
-Besides the input datasets, the quantile operator also needs a single
parameter (the requested quantile).
-The parameter should be the first popped operand, with a value between (and
including) 0 and 1.
-The second popped operand must be the number of datasets to use.
+The reason this worked is that @mymath{125} is now converted into an unsigned
16-bit integer before the @code{+} operator.
+Since this is larger than an 8-bit integer, the C programming language's
automatic type conversion will treat both as the wider type and store the
result of the binary operation (@code{+}) in that type.
-In the example below, the first-popped operand (@command{0.7}) is the
quantile, the second-popped operand (@command{3}) is the number of datasets to
pop.
+For a basic operation like the command above, a faster hack would be either
of the two commands below (which are equivalent).
+This is because @code{125.0} or @code{125.} is interpreted as a floating-point
type, which does not suffer from such issues (converting only one input is
enough):
@example
-astarithmetic a.fits b.fits c.fits 3 0.7 quantile
+$ astarithmetic 125. 10 +
+$ astarithmetic 125.0 10 +
@end example
-@item sigclip-number
-For each pixel, find the sigma-clipped number (after removing outliers) in all
given datasets.
-The output will have an unsigned 32-bit integer type (see @ref{Numeric
data types}).
+For this particular command, the fix above will be as fast as the
@code{uint16} solution.
+This is because there are only two numbers, and the overhead of Arithmetic
(reading configuration files, etc.) dominates the running time.
+However, for large datasets, the @code{uint16} solution will be faster (as you
saw above), Arithmetic will consume less RAM while running, and the output will
consume less storage in your system (all major benefits)!
-This operator will combine the specified number of inputs into a single output
that contains the number of remaining elements after @mymath{\sigma}-clipping
on each element/pixel (for more on @mymath{\sigma}-clipping, see @ref{Sigma
clipping}).
-This operator is very similar to @command{min}, with the exception that it
expects two operands (parameters for sigma-clipping) before the total number of
inputs.
-The first popped operand is the termination criteria and the second is the
multiple of @mymath{\sigma}.
+It is possible to add internal checks in Gnuastro to catch integer overflows
and correct them automatically.
+However, we have not opted for this solution because all those checks will
consume significant resources and slow down the program (especially with large
datasets where RAM, storage and running time become important).
+To be optimal, we therefore trust that you (the wise Gnuastro user!) make the
appropriate type conversion in your commands where necessary (recall that the
operators are available in @ref{Numerical type conversion operators}).
-For example, in the command below, the first popped operand (@command{0.2}) is
the sigma clipping termination criteria.
-If the termination criteria is larger than, or equal to, 1 it is interpreted
as the number of clips to do.
-But if it is between 0 and 1, then it is the tolerance level on the standard
deviation (see @ref{Sigma clipping}).
-The second popped operand (@command{5}) is the multiple of sigma to use in
sigma-clipping.
-The third popped operand (@command{10}) is number of datasets that will be
used (similar to the first popped operand to @command{min}).
+@node Arithmetic operators, Invoking astarithmetic, Integer benefits and
pitfalls, Arithmetic
+@subsection Arithmetic operators
-@example
-astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-number
-@end example
+In this section, the operators recognized by Arithmetic (and the Table
program's @ref{Column arithmetic}) are listed and discussed in detail with
examples.
+As mentioned before, to be able to easily do complex operations on the
command-line, the Reverse Polish Notation is used (where you write
`@mymath{4\quad5\quad+}' instead of `@mymath{4 + 5}').
+If you are not already familiar with it, please see @ref{Reverse polish
notation} before continuing.
-@item sigclip-median
-For each pixel, find the sigma-clipped median in all given datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{sigclip-number} operator,
please see there for more.
-For example
-@example
-astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-median
-@end example
+The operands to all operators can be a data array (for example, a FITS image
or data cube) or a number; the output will be an array or a number according
to the inputs.
+For example, a number multiplied by an array will produce an array.
+The numerical data type of the output of each operator is described within it.
+Here are some generic tips and tricks (relevant to all operators):
-@item sigclip-mean
-For each pixel, find the sigma-clipped mean in all given datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{sigclip-number} operator,
please see there for more.
-For example
+@table @asis
+@item Multiple operators in one command
+When you need to apply several consecutive arithmetic operations, you can
perform them all in a single command instead of multiple commands.
+For example, assume you want to apply a threshold of 10 on your image, and
label the connected groups of pixels above this threshold.
+You need two operators for this: @code{gt} (for ``greater than'', see
@ref{Conditional operators}) and @code{connected-components} (see
@ref{Mathematical morphology operators}).
+The bad (non-optimized and slow) way of doing this is to call Arithmetic two
times:
@example
-astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-mean
+$ astarithmetic image.fits 10 gt --output=thresh.fits
+$ astarithmetic thresh.fits 2 connected-components \
+ --output=labeled.fits
+$ rm thresh.fits
@end example
-@item sigclip-std
-For each pixel, find the sigma-clipped standard deviation in all given
datasets.
-The output will have a single-precision (32-bit) floating point type.
-This operator is called similar to the @command{sigclip-number} operator,
please see there for more.
-For example
+The good (optimal) way is to call them after each other (remember @ref{Reverse
polish notation}):
@example
-astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-std
+$ astarithmetic image.fits 10 gt 2 connected-components \
+ --output=labeled.fits
@end example
-@end table
-
-@node Filtering operators, Pooling operators, Stacking operators, Arithmetic
operators
-@subsubsection Filtering (smoothing) operators
-Image filtering is commonly used for smoothing: every pixel value in the
output image is created by applying a certain statistic to the pixels in its
vicinity.
+You can similarly add any number of operations that must be done sequentially
in a single command and benefit from the speed and lack of intermediate files.
+When your commands become long, you can use the @code{set-AAA} operator to
make it more readable, see @ref{Operand storage in memory or a file}.
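+For example, the sketch below thresholds the image at its own median: with
@code{set-i}, the input is read only once, but used two times:
+@example
+$ astarithmetic image.fits set-i i i medianvalue gt \
+                2 connected-components --output=labeled.fits
+@end example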
-@table @command
-@item filter-mean
-Apply mean filtering (or @url{https://en.wikipedia.org/wiki/Moving_average,
moving average}) on the input dataset.
-During mean filtering, each pixel (data element) is replaced by the mean value
of all its surrounding pixels (excluding blank values).
-The number of surrounding pixels in each dimension (to calculate the mean) is
determined through the earlier operands that have been pushed onto the stack
prior to the input dataset.
-The number of necessary operands is determined by the dimensions of the input
dataset (first popped operand).
-The order of the dimensions on the command-line is the order in FITS format.
-Here is one example:
+@item Blank pixels in Arithmetic
+Blank pixels in the image (see @ref{Blank pixels}) will be stored based on the
data type.
+When the input is floating point type, blank values are NaN.
+One aspect of NaN values is that by definition they will fail on @emph{any}
comparison.
+Also, any operator that includes a NaN as an operand will produce a NaN
(irrespective of its other operands).
+Hence both equal and not-equal operators will fail when both their operands
are NaN!
+Therefore, the only way to guarantee selection of blank pixels is through the
@command{isblank} operator explained above.
+
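+For example, the sketch below will produce a binary mask of the assumed
input: 1 for its blank pixels and 0 elsewhere:
+@example
+$ astarithmetic image.fits isblank --output=blank-mask.fits
+@end example
+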
+One way you can exploit this property of the NaN value to your advantage is
when you want a fully zero-valued image (even over the blank pixels) based on
an already existing image (with the same size and world coordinate system
settings).
+The following command will produce this for you:
@example
-$ astarithmetic 5 4 image.fits filter-mean
+$ astarithmetic input.fits nan eq --output=all-zeros.fits
@end example
@noindent
-In this example, each pixel is replaced by the mean of a 5 by 4 box around it.
-The box is 5 pixels along the first FITS dimension (horizontal when viewed in
ds9) and 4 pixels along the second FITS dimension (vertical).
+Note that on the command-line you can write NaN in any case (for example,
@command{NaN}, or @command{NAN} are also acceptable).
+Reading NaN as a floating point number in Gnuastro is not case-sensitive.
+@end table
-Each pixel will be placed in the center of the box that the mean is calculated
on.
-If the given width along a dimension is even, then the center is assumed to be
between the pixels (not in the center of a pixel).
-When the pixel is close to the edge, the pixels of the box that fall outside
the image are ignored.
-Therefore, on the edge, less points will be used in calculating the mean.
+@menu
+* Basic mathematical operators:: For example, +, -, /, log, and pow.
+* Trigonometric and hyperbolic operators:: sin, cos, atan, asinh, etc.
+* Constants:: Physical and Mathematical constants.
+* Unit conversion operators:: Various unit conversions necessary.
+* Statistical operators:: Statistics of a single dataset (for example,
mean).
+* Stacking operators:: Coadding or combining multiple datasets into
one.
+* Filtering operators:: Smoothing a dataset through mixing pixel with
neighbors.
+* Pooling operators:: Reducing size through statistics of pixels in
window.
+* Interpolation operators:: Giving blank pixels a value.
+* Dimensionality changing operators:: Collapse or expand a dataset.
+* Conditional operators:: Select certain pixels within the dataset.
+* Mathematical morphology operators:: Work on binary images, for example,
erode.
+* Bitwise operators:: Work on bits within one pixel.
+* Numerical type conversion operators:: Convert the numeric datatype of a
dataset.
+* Random number generators:: Random numbers can be used to add noise for
example.
+* Box shape operators:: Dealing with box shapes and coordinates of
vertices.
+* Loading external columns:: Read a column from a table into the stack.
+* Size and position operators:: Extracting image size and pixel positions.
+* Building new dataset and stack management:: How to construct an empty
dataset from scratch.
+* Operand storage in memory or a file:: Tools for complex operations in one
command.
+@end menu
-The final effect of mean filtering is to smooth the input image; it is
essentially a convolution with a kernel that has identical values for all its
pixels (is flat), see @ref{Convolution process}.
+@node Basic mathematical operators, Trigonometric and hyperbolic operators,
Arithmetic operators, Arithmetic operators
+@subsubsection Basic mathematical operators
-Note that blank pixels will also be affected by this operator: if there are
any non-blank elements in the box surrounding a blank pixel, then in the
filtered image it will have the mean of those non-blank elements and will
therefore no longer be blank.
-If blank elements are important for your analysis, you can use the
@code{isblank} operator with the @code{where} operator to set them back to
blank after filtering.
+These are some of the most common operations you will be doing on your data,
so no further explanation is necessary.
+If you are new to Gnuastro, just read the description of each carefully.
-For example in the command below, we are first filtering the image, then
setting its original blank elements back to blank in the output of filtering
(all within one Arithmetic command).
-Note how we are using the @code{set-} operator to give names to the temporary
outputs of steps and simplify the code (see @ref{Operand storage in memory or a
file}).
+@table @command
+@item +
+Addition, so ``@command{4 5 +}'' is equivalent to @mymath{4+5}.
+For example, in the command below, the value 20000 is added to each pixel's
value in @file{image.fits}:
@example
-$ astarithmetic image.fits -h1 set-in \
- 5 4 in filter-mean set-filtered \
- filtered in isblank nan where \
- --output=out.fits
+$ astarithmetic 20000 image.fits +
@end example
+You can also use this operator to sum the values of one pixel in two images
(which have to be the same size).
+For example, in the commands below (which are identical, see paragraph after
the commands), each pixel of @file{sum.fits} is the sum of the same pixel's
values in @file{a.fits} and @file{b.fits}.
+@example
+$ astarithmetic a.fits b.fits + -h1 -h1 --output=sum.fits
+$ astarithmetic a.fits b.fits + -g1 --output=sum.fits
+@end example
+The HDU/extension has to be specified for each image with @option{-h}.
+However, if the HDUs are the same in all inputs, you can use @option{-g} to
only specify the HDU once.
-@item filter-median
-Apply @url{https://en.wikipedia.org/wiki/Median_filter, median filtering} on
the input dataset.
-This is very similar to @command{filter-mean}, except that instead of the mean
value of the box pixels, the median value is used to replace a pixel value.
-For more on how to use this operator, please see @command{filter-mean}.
-
-The median is less susceptible to outliers compared to the mean.
-As a result, after median filtering, the pixel values will be more
discontinuous than after mean filtering.
+If you need to add more than two datasets, one way is to use this operator
multiple times; for example, see the two commands below that are identical in
the Reverse Polish Notation (@ref{Reverse polish notation}):
+@example
+$ astarithmetic a.fits b.fits + c.fits + -osum.fits
+$ astarithmetic a.fits b.fits c.fits + + -osum.fits
+@end example
-@item filter-sigclip-mean
-Apply a @mymath{\sigma}-clipped mean filtering onto the input dataset.
-This is very similar to @code{filter-mean}, except that all outliers
(identified by the @mymath{\sigma}-clipping algorithm) have been removed, see
@ref{Sigma clipping} for more on the basics of this algorithm.
-As described there, two extra input parameters are necessary for
@mymath{\sigma}-clipping: the multiple of @mymath{\sigma} and the termination
criteria.
-@code{filter-sigclip-mean} therefore needs to pop two other operands from the
stack after the dimensions of the box.
+However, this can get annoying/buggy if you have more than three or four
images; in that case, a better way to sum data is to use the @code{sum}
operator (which also ignores blank pixels), as discussed in @ref{Stacking
operators}.
-For example, the line below uses the same box size as the example of
@code{filter-mean}.
-However, all elements in the box that are iteratively beyond @mymath{3\sigma}
of the distribution's median are removed from the final calculation of the mean
until the change in @mymath{\sigma} is less than @mymath{0.2}.
+@cartouche
+@noindent
+@strong{NaN values:} if even one operand of @code{+} has a NaN value, the
output will also be NaN.
+To ignore NaN values, use the @code{sum} operator of @ref{Stacking operators}.
+You can see the difference with the two commands below:
@example
-$ astarithmetic 3 0.2 5 4 image.fits filter-sigclip-mean
+$ astarithmetic --quiet 1.0 2.0 3.0 nan + + +
+nan
+$ astarithmetic --quiet 1.0 2.0 3.0 nan 4 sum
+6.000000e+00
@end example
-The median (which needs a sorted dataset) is necessary for
@mymath{\sigma}-clipping, therefore @code{filter-sigclip-mean} can be
significantly slower than @code{filter-mean}.
-However, if there are strong outliers in the dataset that you want to ignore
(for example, emission lines on a spectrum when finding the continuum), this is
a much better solution.
+The same goes for all the @ref{Stacking operators} so if your data may include
NaN pixels, be sure to use the stacking operators.
+@end cartouche
-@item filter-sigclip-median
-Apply a @mymath{\sigma}-clipped median filtering onto the input dataset.
-This operator and its necessary operands are almost identical to
@code{filter-sigclip-mean}, except that after @mymath{\sigma}-clipping, the
median value (which is less affected by outliers than the mean) is added back
to the stack.
-@end table
+@item -
+Subtraction, so ``@command{4 5 -}'' is equivalent to @mymath{4-5}.
+Usage of this operator is similar to @command{+} operator, for example:
+@example
+$ astarithmetic 20000 image.fits -
+$ astarithmetic a.fits b.fits - -g1 --output=sub.fits
+@end example
-@node Pooling operators, Interpolation operators, Filtering operators,
Arithmetic operators
-@subsubsection Pooling operators
+@item x
+Multiplication, so ``@command{4 5 x}'' is equivalent to @mymath{4\times5}.
+For example, in the command below, the value of each output pixel is 5 times
its value in @file{image.fits}:
+@example
+$ astarithmetic image.fits 5 x
+@end example
+And you can multiply the value of each pixel in two images, like this:
+@example
+$ astarithmetic a.fits b.fits x -g1 --output=multip.fits
+@end example
-@cindex Pooling
-@cindex Convolutional Neural Networks
-Pooling is one way of reducing the complexity of the input image by grouping
multiple input pixels into one output pixel (using any statistical measure).
-As a result, the output image has fewer pixels (less complexity).
-In Computer Vision, Pooling is commonly used in
@url{https://en.wikipedia.org/wiki/Convolutional_neural_network, Convolutional
Neural Networks} (CNNs).
+@item /
+Division, so ``@command{4 5 /}'' is equivalent to @mymath{4/5}.
+Like multiplication, for example:
+@example
+$ astarithmetic image.fits 5 -h1 /
+$ astarithmetic a.fits b.fits / -g1 --output=div.fits
+@end example
-In pooling, the inputs are an image (e.g., a FITS file) and the pixel size of
a square window (known as a pooling window).
-The window has to be smaller than the input's number of pixels in both
dimensions and its width is called the pool size.
-This window slides over all pixels in the input from the top-left corner to
the bottom-right corner (covering each input pixel only once).
-Currently the ``stride'' (or spacing between the windows as they slide over
the input) is equal to the window-size in Arithmetic.
-In other words, in pooling, the separate ``windows'' do not overlap with each
other on the input.
-Therefore there are two major differences with @ref{Spatial domain
convolution} or @ref{Filtering operators}, but pooling has some similarities to
the @ref{Warp}.
-@itemize
-@item
-In convolution or filtering the input and output sizes are the same.
-However, the output of pooling has fewer pixels.
-@item
-In convolution or filters, the kernels slide over the input in a pixel-by-pixel
manner.
-As a result, the same pixel's value will be used in many of the output pixels.
-However, in pooling each input pixel is only used in a single output pixel.
-@item
-Special cases of Warping an image are similar to pooling.
-For example calling @code{pool-sum} with pool size of 2 will give the same
pixel values (except the outer edges) as giving the same image to
@command{astwarp} with @option{--scale=1/2 --centeroncorner}.
-However, warping will only provide the sum of the input pixels; there is no
easy way to generically define something like @code{pool-max} in Warp (which is
far more general than pooling).
-Also, due to its generic features (for example for non-linear warps), Warp is
slower than the @code{pool-max} that is introduced here.
-@end itemize
+@item %
+Modulo (remainder), so ``@command{3 2 %}'' will return @mymath{1}.
+Note that the modulo operator only works on integer types (see @ref{Numeric
data types}).
+This operator is therefore not defined for most processed astronomical
images that have floating-point values.
+However, it is useful in labeled images (for example, @ref{Segment output}).
+In such cases, each pixel is the integer label of the object it is associated
with.
+Hence, with the example command below, we can change the object labels to only
be between 1 and 4, and decrease the number of objects in the image to 4/5th
(all objects with a label that is a multiple of 5 will be set to 0).
+@example
+$ astarithmetic label.fits 5 %
+@end example
-@cartouche
-@noindent
-@strong{No WCS in output:} As of Gnuastro @value{VERSION}, the output of
pooling will not contain WCS information (primarily due to a lack of time by
developers).
-Please inform us of your interest in having it, by contacting us at
@command{bug-gnuastro@@gnu.org}.
-If you need @code{pool-sum}, you can use @ref{Warp} (which also modifies the
WCS, see note above).
-@end cartouche
+@item abs
+Absolute value of first operand, so ``@command{4 abs}'' is equivalent to
@mymath{|4|}.
+For example, the output of the command below will not have any negative
pixels (all negative pixels will be multiplied by @mymath{-1} to become
positive):
+@example
+$ astarithmetic image.fits abs
+@end example
-If the width or height of the input is not divisible by the pool size, the pool
window will go beyond the input pixel grid.
-In this case, the window pixels that do not overlap with the input are given a
blank value (and thus ignored in the calculation of the desired statistical
operation).
-The simple ASCII figure below shows the pooling operation where the input is a
@mymath{3\times3} pixel image with a pool size of 2 pixels.
-In the center of the second row, you see the intermediate input matrix that
highlights how the input and output pixels relate with each other.
-Since the input is @mymath{3\times3} and we have a pool size of 2, as
mentioned above blank pseudo-pixels are added with a value of @code{B} (for
blank).
+@item pow
+First operand to the power of the second, so ``@command{4.3 5 pow}'' is
equivalent to @mymath{4.3^{5}}.
+For example, with the command below all pixels will be squared:
+@example
+$ astarithmetic image.fits 2 pow
+@end example
+@item sqrt
+The square root of the first operand, so ``@command{5 sqrt}'' is equivalent to
@mymath{\sqrt{5}}.
+Since the square root is only defined for positive values, any negative-valued
pixel will become NaN (blank).
+The output will have a floating point type, but its precision is determined
from the input: if the input is a 64-bit floating point, the output will also
be 64-bit.
+Otherwise, the output will be 32-bit floating point (see @ref{Numeric data
types} for the respective precision).
+Therefore if you require 64-bit precision in estimating the square root,
convert the input to 64-bit floating point first, for example, with @code{5
float64 sqrt}.
+For example, each pixel of the output of the command below will be the square
root of that pixel in the input.
@example
- Pool window: Input:
- +-----------+ +-------------+
- | | | | 10 12 9 |
- | _ _ | _ _ |___________________________| 31 4 1 |
- | | | || || | 16 5 8 |
- | | | || || +-------------+
- +-----------+ || ||
- The pooling window 2*2 || ||
- stride 2 \/ \/
- +---------------------+
- |/ 10 12\|/ 9 B \|
- | | |
- +-------+ pool-min |\ 31 4 /|\ 1 B /| pool-max +-------+
- | 4 1 | /------ |---------------------| ------\ |31 9 |
- | 5 8 | \------ |/ 16 5 \|/ 8 B \| ------/ |16 8 |
- +-------+ | | | +-------+
- |\ B B /.\ B B /|
- +---------------------+
+$ astarithmetic image.fits sqrt
@end example
-The choice of the statistic to use depends on the specific use case, the
characteristics of the input data, and the desired output.
-Each statistic has its advantages and disadvantages and the choice of which to
use should be informed by the specific needs of the problem at hand.
-Below, the various pool operators of arithmetic are listed:
+If you just want to scale an image with negative values using this operator
(for better visual inspection, and the actual values do not matter for you),
you can subtract the image from its minimum value, then take its square root:
-@table @command
+@example
+$ astarithmetic image.fits image.fits minvalue - sqrt -g1
+@end example
-@item pool-max
-Apply max-pooling on the input dataset.
-This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
-Within the pooling window, this operator will place the largest value in the
output pixel (any blank pixels will be ignored).
+Alternatively, to avoid reading the image into memory two times, you can use
the @option{set-} operator to read it into the variable @option{i} and use
@option{i} two times to speed up the operation (described below):
-See the ASCII diagram above for a demonstration of how max-pooling works.
-Here is an example of using this operator:
+@example
+$ astarithmetic image.fits set-i i i minvalue - sqrt
+@end example
+@item log
+Natural logarithm of first operand, so ``@command{4 log}'' is equivalent to
@mymath{ln(4)}.
+Negative pixels will become NaN, and the output type is determined from the
input, see the explanation under @command{sqrt} for more on these features.
+For example, the command below will take the natural logarithm of every pixel
in the input.
@example
-$ astarithmetic image.fits 2 pool-max
+$ astarithmetic image.fits log --output=log.fits
@end example
-Max-pooling retains the largest value of the input window in the output, so
the returned image is sharper where you have strong signal-to-noise ratio and
more noisy in regions with no significant signal (only noise).
-It is therefore useful when the background of the image is dark and we are
interested in only the highest signal-to-noise ratio regions of the image.
+@item log10
+Base-10 logarithm of first popped operand, so ``@command{4 log10}'' is
equivalent to @mymath{log_{10}(4)}.
+Negative pixels will become NaN, and the output type is determined from the
input, see the explanation under @command{sqrt} for more on these features.
+For example, the command below will take the base-10 logarithm of every pixel
in the input.
+@example
+$ astarithmetic image.fits log10
+@end example
+@end table
-@item pool-min
-Apply min-pooling on the input dataset.
-This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
-Except the used statistical measurement, this operator is similar to
@code{pool-max}, see the description there for more.
-
-Min-pooling is mostly used when the image has a high signal-to-noise ratio and
a light background: min-pooling will select darker (lower-valued) pixels.
-For low signal-to-noise regions, this operator will increase the noise level
(similar to the maximum, the scatter in the minimum is very strong).
-
-@item pool-sum
-Apply sum-pooling to the input dataset.
-This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
-Except the used statistical measurement, this operator is similar to
@code{pool-max}, see the description there for more.
+@node Trigonometric and hyperbolic operators, Constants, Basic mathematical
operators, Arithmetic operators
+@subsubsection Trigonometric and hyperbolic operators
-Sum-pooling will increase the signal-to-noise ratio at the cost of having a
smoother output (less resolution).
+All the trigonometric and hyperbolic functions are described here.
+One good thing with these operators is that they take their inputs and give
their outputs in degrees (which we usually need), not radians (like most
other programs/libraries).
-@item pool-mean
-Apply mean pooling on the input dataset.
-This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
-Except the used statistical measurement, this operator is similar to
@code{pool-max}, see the description there for more.
+@table @command
-The mean pooling method smooths out the image and hence the sharp features may
not be identified when this pooling method is used.
-This therefore preserves more information than max-pooling, but may also
reduce the effect of the most prominent pixels.
-Mean is often used where a more accurate representation of the input is
required.
+@item sin
+@itemx cos
+@itemx tan
+@cindex Trigonometry
+Basic trigonometric functions.
+They take one operand, in units of degrees.
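+For example, with Table's @ref{Column arithmetic} (also used in the
@code{atan2} examples below), the sine of 30 degrees is 0.5:
+@example
+$ echo 30 | asttable -c'arith $1 sin'      # --> 0.5
+@end example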
-@item pool-median
-Apply median pooling on the input dataset.
-This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
-Except the used statistical measurement, this operator is similar to
@code{pool-max}, see the description there for more.
+@item asin
+@itemx acos
+@itemx atan
+Inverse trigonometric functions.
+They take one operand and the returned values are in units of degrees.
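+For example, the minimal command below (a small sketch) computes the inverse sine of 0.5:
+@example
+$ echo 0.5 | asttable -c'arith $1 asin'    # --> 30
+@end example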
-In general, the mean is mathematically easier to interpret but more susceptible to outliers, while the median is less affected by outliers and therefore gives a smoother output.
-Median-pooling is thus better for low signal-to-noise (noisy) features and extended features (where you do not want a single high or low valued pixel to affect the output).
-@end table
+@item atan2
+Inverse tangent (output in units of degrees) that uses the signs of the input
coordinates to distinguish between the quadrants.
+This operator therefore needs two operands: the first popped operand is
assumed to be the X axis position of the point, and the second popped operand
is its Y axis coordinate.
-@node Interpolation operators, Dimensionality changing operators, Pooling
operators, Arithmetic operators
-@subsubsection Interpolation operators
+For example, see the commands below.
+To be more clear, we are using Table's @ref{Column arithmetic} which uses
exactly the same internal library function as the Arithmetic program for images.
+We are showing the results for four points in the four quadrants of the 2D
space (if you want to try running them, you do not need to type/copy the parts
after @key{#}).
+The first point (2,2) is in the first quadrant, therefore the returned angle
is 45 degrees.
+The other three points are in the fourth, third and second quadrants respectively, and the returned angles reflect each point's quadrant.
-Interpolation is the process of removing blank pixels from a dataset (by
giving them a value based on the non-blank neighbors).
+@example
+$ echo " 2 2" | asttable -c'arith $2 $1 atan2' # --> 45
+$ echo " 2 -2" | asttable -c'arith $2 $1 atan2' # --> -45
+$ echo "-2 -2" | asttable -c'arith $2 $1 atan2' # --> -135
+$ echo "-2 2" | asttable -c'arith $2 $1 atan2' # --> 135
+@end example
-@table @command
+However, if you simply use the classic arc-tangent operator (@code{atan}) for
the same points, the result will only be in two quadrants as you see below:
+@example
+$ echo " 2 2" | asttable -c'arith $2 $1 / atan' # --> 45
+$ echo " 2 -2" | asttable -c'arith $2 $1 / atan' # --> -45
+$ echo "-2 -2" | asttable -c'arith $2 $1 / atan' # --> 45
+$ echo "-2 2" | asttable -c'arith $2 $1 / atan' # --> -45
+@end example
-@item interpolate-medianngb
-Interpolate the blank elements of the second popped operand with the median of
nearest non-blank neighbors to each.
-The number of the nearest non-blank neighbors used to calculate the median is
given by the first popped operand.
+@item sinh
+@itemx cosh
+@itemx tanh
+@cindex Hyperbolic functions
+Hyperbolic sine, cosine, and tangent.
+These operators take a single operand.
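+For example, the minimal command below (just a sketch) applies the hyperbolic sine to a single value from the standard input:
+@example
+$ echo 1 | asttable -c'arith $1 sinh'
+@end example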
-The distance of the nearest non-blank neighbors is irrelevant in this
interpolation.
-The neighbors of each blank pixel will be parsed in expanding circular rings (for 2D images) or spherical surfaces (for a 3D cube), and each non-blank element over them is stored in memory.
-When the requested number of non-blank neighbors have been found, their median
is used to replace that blank element.
-For example, the line below replaces each blank element with the median of the
nearest 5 pixels.
+@item asinh
+@itemx acosh
+@itemx atanh
+Inverse hyperbolic sine, cosine, and tangent.
+These operators take a single operand.
+@end table
-@example
-$ astarithmetic image.fits 5 interpolate-medianngb
-@end example
+@node Constants, Unit conversion operators, Trigonometric and hyperbolic
operators, Arithmetic operators
+@subsubsection Constants
+@cindex Pi
+During your analysis it is often necessary to have certain constants like the
number @mymath{\pi}.
+The ``operators'' in this section do not actually take any operand; they just push the desired constant onto the stack.
+So in effect, they behave like operands.
+But since their values are not entered by the user, they are placed in the list of operators.
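+For example, the minimal command below (just a sketch) multiplies 2 by @mymath{\pi} on the command-line (printing approximately 6.283):
+@example
+$ astarithmetic 2 pi x --quiet
+@end example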
-This operator is not good when you want each blank region to be filled with a single fixed value (for example, the centers of saturated stars), because the pixels used to interpolate the various parts of a region differ.
-For such scenarios, you may use @code{interpolate-maxofregion} or @code{interpolate-minofregion} (described below).
+@table @code
+@item e
+@cindex e (base of natural logarithm)
+@cindex Euler's number (@mymath{e})
+@cindex Base of natural logarithm (@mymath{e})
+Euler's number, or the base of the natural logarithm (no units).
+See @url{https://en.wikipedia.org/wiki/E_(mathematical_constant), Wikipedia}.
-@item interpolate-meanngb
-Similar to @code{interpolate-medianngb}, but will fill the blank values of the
dataset with the mean value of the requested number of nearest neighbors.
+@item pi
+@cindex Pi
+Ratio of a circle's circumference to its diameter (no units).
+See @url{https://en.wikipedia.org/wiki/Pi, Wikipedia}.
-@item interpolate-minngb
-Similar to @code{interpolate-medianngb}, but will fill the blank values of the
dataset with the minimum value of the requested number of nearest neighbors.
+@item c
+@cindex Speed of light
+The speed of light in vacuum, in units of @mymath{m/s}.
+See @url{https://en.wikipedia.org/wiki/Speed_of_light, Wikipedia}.
-@item interpolate-maxngb
-Similar to @code{interpolate-medianngb}, but will fill the blank values of the
dataset with the maximum value of the requested number of nearest neighbors.
-One useful implementation of this operator is to fill the saturated pixels of
stars in images.
+@item G
+@cindex @mymath{G} (gravitational constant)
+@cindex Gravitational constant (@mymath{G})
+The gravitational constant, in units of @mymath{m^3/kg/s^2}.
+See @url{https://en.wikipedia.org/wiki/Gravitational_constant, Wikipedia}.
-@item interpolate-minofregion
-Interpolate all blank regions (consisting of many blank pixels that are
touching) in the second popped operand with the minimum value of the pixels
that are immediately bordering that region (a single value).
-The first popped operand is the connectivity (see description in
@command{connected-components}).
+@item h
+@cindex @mymath{h} (Planck's constant)
+@cindex Planck's constant (@mymath{h})
+Planck's constant, in units of @mymath{J/Hz} or @mymath{kg\times m^2/s}.
+See @url{https://en.wikipedia.org/wiki/Planck_constant, Wikipedia}.
-For example, with the command below all the connected blank regions of
@file{image.fits} will be filled.
-It's an image (2D dataset), so a connectivity of 2 means that the independent blank regions are defined by 8-connected neighbors.
-If the connectivity were 1, the regions would be defined by 4-connectivity: blank regions that only touch on the corner of one pixel would be identified as separate regions.
+@item au
+@cindex Astronomical Unit (AU)
+@cindex AU (Astronomical Unit)
+Astronomical Unit, in units of meters.
+See @url{https://en.wikipedia.org/wiki/Astronomical_unit, Wikipedia}.
-@example
-$ astarithmetic image.fits 2 interpolate-minofregion
-@end example
+@item ly
+@cindex Light year
+Distance covered by light in vacuum in one year, in units of meters.
+See @url{https://en.wikipedia.org/wiki/Light-year, Wikipedia}.
-@item interpolate-maxofregion
-@cindex Saturated pixels
-Similar to @code{interpolate-minofregion}, but the maximum is used to fill the
blank regions.
+@item avogadro
+@cindex Avogadro's number
+Avogadro's constant, in units of @mymath{1/mol}.
+See @url{https://en.wikipedia.org/wiki/Avogadro_constant, Wikipedia}.
-This operator can be useful, for example, in filling the saturated pixels of stars.
-Recall that the @option{interpolate-maxngb} operator looks for the maximum value within a given number of neighboring pixels and is more useful in small noisy regions.
-Therefore, as the blank regions become larger, @option{interpolate-maxngb} can fragment the filled values within a connected blank region, because the nearest neighbors to one part of the region may not fall within the pixels searched for the other parts.
-With this operator, the size of the blank region is irrelevant: all the pixels bordering the blank region are parsed and their maximum value is used for the whole region.
+@item fine-structure
+@cindex Fine structure constant
+The fine-structure constant (no units).
+See @url{https://en.wikipedia.org/wiki/Fine-structure_constant, Wikipedia}.
@end table
-@node Dimensionality changing operators, Conditional operators, Interpolation
operators, Arithmetic operators
-@subsubsection Dimensionality changing operators
+@node Unit conversion operators, Statistical operators, Constants, Arithmetic
operators
+@subsubsection Unit conversion operators
-With these operators you can change the dimensionality of the output by applying certain statistics along the dimensions that should be removed.
-For example, let's assume you have a 3D data cube that has 300 by 300 pixels
in the RA and Dec dimensions (first two dimensions), and 3600 slices along the
wavelength (third dimension), so the whole cube is
@mymath{300\times300\times3600} voxels (volume elements).
-To create a narrow-band image that only contains 100 slices around a certain
wavelength, you can crop that section (using @ref{Crop}), giving you a
@mymath{300\times300\times100} cube.
-You can now use the @code{collapse-sum} operator below to ``collapse'' all the
100 slices into one 2D image that has @mymath{300\times300} pixels.
-Every pixel in this 2D image will have the flux of the sum of the 100 slices.
+It often happens that you have data in one unit (for example, counts on your
CCD), but would like to convert it into another (for example, magnitudes, to
measure the brightness of a galaxy).
+While the equations for these unit conversions can easily be found on the internet, the operators in this section are designed to simplify the process, letting you do it quickly without having to remember constants and relations.
@table @command
-@item to-1d
-Convert the input operand into a 1D array, irrespective of the number of dimensions it has.
-This operator only takes a single operand (the input array) and just updates
the metadata.
-Therefore it does not change the layout of the array contents in memory and is
very fast.
-
-If no further operation is requested on the 1D array, recall that Arithmetic
will write a 1D array as a table column by default.
-In case you want the output to be saved as a 1D image, or to see it on the
standard output, please use the @code{--onedasimage} or @code{--onedonstdout}
options respectively (see @ref{Invoking astarithmetic}).
+@item counts-to-mag
+Convert counts (usually CCD outputs) to magnitudes using the given zero point.
+The zero point is the first popped operand and the count image or value is the
second popped operand.
-This operator is useful in scenarios where after some operations on a 2D image
or 3D cube, the dimensionality is no longer relevant for you and you just care
about the values.
-In the example below, we will first make a simple 2D image from a plain-text
file, then convert it to a 1D array:
+For example, assume you have measured the standard deviation of the noise in
an image to be @code{0.1} counts, and the image's zero point is @code{22.5} and
you want to measure the @emph{per-pixel} surface brightness limit of the
dataset@footnote{The @emph{per-pixel} surface brightness limit is the magnitude
of the noise standard deviation. For more on surface brightness see
@ref{Brightness flux magnitude}.
+In the example command, because the output is a single number, we are using
@option{--quiet} to avoid printing extra information.}.
+To apply this operator on an image, simply replace @code{0.1} with the image
name, as described below.
@example
-## Contents of 'a.txt' to start with.
-$ cat a.txt
-# Image 1: DEMO [counts, uint8] An example image
-1 2 3
-4 5 6
-7 8 9
+$ astarithmetic 0.1 22.5 counts-to-mag --quiet
+@end example
-## Convert the text image into a FITS image.
-$ astconvertt a.txt -o a.fits
+Of course, you can also convert every pixel in an image (or table column in
Table's @ref{Column arithmetic}) with this operator if you replace the second
popped operand with an image/column name.
+For an example of applying this operator on an image, see the description of
surface brightness in @ref{Brightness flux magnitude}, where we will convert an
image's pixel values to surface brightness.
-## Convert it into a table column (1D):
-$ astarithmetic a.fits to-1d -o table.fits
+@item mag-to-counts
+Convert magnitudes to counts (usually CCD outputs) using the given zero point.
+The zero point is the first popped operand and the magnitude value is the
second.
+For example, if an object has a magnitude of 20, you can estimate the counts corresponding to it (when the image has a zero point of 24.8) with the command below.
+Note that because the output is a single number, we are using @option{--quiet} to avoid printing extra information:
-## Convert it into a 1D image:
-$ astarithmetic a.fits to-1d -o table.fits --onedasimage
+@example
+$ astarithmetic 20 24.8 mag-to-counts --quiet
@end example
-@cindex Flattening (CNNs)
-A more real-world example would be the following: assume you want to
``flatten'' two images into a single 1D array (as commonly done in
convolutional neural networks, or
CNNs@footnote{@url{https://en.wikipedia.org/wiki/Convolutional_neural_network}}).
-First, we show the contents of a new @mymath{2\times2} image in plain-text format, then convert it to a 2D FITS image (@file{b.fits}).
-We will then use arithmetic to make both @file{a.fits} (from the example
above) and @file{b.fits} into a 1D array and stitch them together into a single
1D image with one call to Arithmetic.
-For a description of the @code{stitch} operator, see below (same section).
+@item counts-to-sb
+Convert counts to surface brightness using the zero point and area (in units
of arcsec@mymath{^2}).
+The first popped operand is the area (in arcsec@mymath{^2}), the second popped
operand is the zero point and the third are the count values.
+Estimating the surface brightness involves taking the logarithm.
+Therefore this operator will produce NaN for counts with a negative value.
-@example
-## Contents of 'b.txt':
-$ cat b.txt
-# Image 1: DEMO [counts, uint8] An example image
-10 11
-12 13
+For example, with the commands below, we read the zero point from the image
headers (assuming it is in the @code{ZPOINT} keyword), we calculate the pixel
area from the image itself, and we call this operator to convert the image
pixels (in counts) to surface brightness (mag/arcsec@mymath{^2}).
-## Convert the text image into a FITS image.
-$ astconvertt b.txt -o b.fits
+@example
+$ zeropoint=$(astfits image.fits --keyvalue=ZPOINT -q)
+$ pixarea=$(astfits image.fits --pixelareaarcsec2)
+$ astarithmetic image.fits $zeropoint $pixarea counts-to-sb \
+ --output=image-sb.fits
-# Flatten the two images into a single 1D image:
-$ astarithmetic a.fits to-1d b.fits to-1d 2 1 stitch -g1 \
- --onedonstdout --quiet
-1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
@end example
+For more on the definition of surface brightness see @ref{Brightness flux magnitude}, and for a full tutorial on its optimal usage, see @ref{FITS images in a publication}.
-@item stitch
-Stitch (connect) any number of given images together along the given dimension.
-The output has the same number of dimensions as the input, but the number of
pixels along the requested dimension will be different from the inputs.
-The @code{stitch} operator takes at least three operands:
-@itemize
-@item
-The first popped operand (placed just before @code{stitch}) is the direction
(dimension) that the images should be stitched along.
-The first FITS dimension is along the horizontal, therefore a value of
@code{1} will stitch them horizontally.
-Similarly, giving a value of @code{2} will result in a vertical stitch.
-
-@item
-The second popped operand is the number of images that should be stitched.
-
-@item
-Depending on the value given to the second popped operand, @code{stitch} will
pop the given number of datasets from the stack and stitch them along the given
dimension.
-The popped images have to have the same number of pixels along the other
dimension.
-The order of the stitching is defined by how they are placed in the
command-line, not how they are popped (after being popped, they are placed in a
list in the same order).
-@end itemize
-
-For example, in the commands below, we will first crop out fixed-sized regions of @mymath{100\times300} pixels from a larger image (@file{large.fits}).
-In the first call of Arithmetic below, we will stitch the bottom set of crops
together along the first (horizontal) axis.
-In the second Arithmetic call, we will stitch all 6 along both dimensions.
+@item sb-to-counts
+Convert surface brightness to counts using the zero point and area (in units of arcsec@mymath{^2}).
+The first popped operand is the area (in arcsec@mymath{^2}), the second popped
operand is the zero point and the third are the surface brightness values.
+See the description of @command{counts-to-sb} for more.
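+For example, reusing the @code{zeropoint} and @code{pixarea} shell variables from the @command{counts-to-sb} example above, a sketch like the one below (with a hypothetical surface brightness image @file{image-sb.fits}) converts it back to counts:
+@example
+$ astarithmetic image-sb.fits $zeropoint $pixarea sb-to-counts \
+                --output=image-counts.fits
+@end example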
+@item mag-to-sb
+Convert magnitudes to surface brightness over a certain area (in units of
arcsec@mymath{^2}).
+The first popped operand is the area and the second is the magnitude.
+For example, let's assume you have a table with the two columns of magnitude
(called @code{MAG}) and area (called @code{AREAARCSEC2}).
+In the command below, we will use @ref{Column arithmetic} to return the
surface brightness.
@example
-## Crop the fixed-size regions of a larger image ('-O' is the
-## short form of the '--mode' option).
-$ astcrop large.fits -Oimg --section=1:100,1:300 -oa.fits
-$ astcrop large.fits -Oimg --section=101:200,1:300 -ob.fits
-$ astcrop large.fits -Oimg --section=201:300,1:300 -oc.fits
-$ astcrop large.fits -Oimg --section=1:100,301:600 -od.fits
-$ astcrop large.fits -Oimg --section=101:200,301:600 -oe.fits
-$ astcrop large.fits -Oimg --section=201:300,301:600 -of.fits
-
-## Stitch the bottom three crops into one image.
-$ astarithmetic a.fits b.fits c.fits 3 1 stitch -obottom.fits
-
-# Stitch all the 6 crops along both dimensions
-$ astarithmetic a.fits b.fits c.fits 3 1 stitch \
- d.fits e.fits f.fits 3 1 stitch \
- 2 2 stitch -g1 -oall.fits
+$ asttable table.fits -c'arith MAG AREAARCSEC2 mag-to-sb'
@end example
-The start of the last command is like the one before it (stitching the bottom
three crops along the first FITS dimension, producing a @mymath{300\times300}
image).
-Later in the same command, we then stitch the top three crops horizontally (again, into a @mymath{300\times300} image).
-This leaves the two @mymath{300\times300} images on the stack (see @ref{Reverse polish notation}).
-This operator is therefore useful in scenarios like placing the CCD amplifiers
into one image.
+@item sb-to-mag
+Convert surface brightness to magnitudes over a certain area (in units of
arcsec@mymath{^2}).
+The first popped operand is the area and the second is the surface brightness.
+See the description of @code{mag-to-sb} for more.
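+For example, assuming a table with the hypothetical columns @code{SB} (surface brightness) and @code{AREAARCSEC2} (area), a sketch like the one below returns the magnitude:
+@example
+$ asttable table.fits -c'arith SB AREAARCSEC2 sb-to-mag'
+@end example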
-@item trim
-Trim all blank elements from the outer edges of the input operand (it only
takes a single operand).
-For example see the commands below using Table's @ref{Column arithmetic}:
+@item counts-to-jy
+@cindex AB magnitude
+@cindex Magnitude, AB
+Convert counts (usually CCD outputs) to Janskys through an AB-magnitude based
zero point.
+The top-popped operand is assumed to be the AB-magnitude zero point and the
second-popped operand is assumed to be a dataset in units of counts (an image
in Arithmetic, and a column in Table's @ref{Column arithmetic}).
+For the full equation and basic definitions, see @ref{Brightness flux
magnitude}.
-@example
-$ cat table.txt
-nan
-nan
-nan
-3
-4
-nan
-5
-6
-nan
+@cindex SDSS
+@cindex Nanomaggy
+For example, SDSS images are calibrated in units of nanomaggies, with a fixed
zero point magnitude of 22.5.
+Therefore you can convert the units of SDSS image pixels to Janskys with the
command below:
-$ asttable table.txt -Y -c'arith $1 trim'
-3.000000
-4.000000
-nan
-5.000000
-6.000000
+@example
+$ astarithmetic sdss-image.fits 22.5 counts-to-jy
@end example
-Similarly, on 2D images or 3D cubes, all outer rows/columns or slices that are
fully blank get ``trim''ed with this operator.
-This is therefore a very useful operator for extracting a certain feature
within your dataset.
+@item jy-to-counts
+Convert Janskys to counts (usually CCD outputs) through an AB-magnitude based
zero point.
+This is the inverse operation of @code{counts-to-jy}; see there for a usage example.
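+For example, a sketch like the one below (assuming a hypothetical @file{image-jy.fits} in units of Janskys) converts its pixels to the counts of a camera with an AB-magnitude zero point of 22.5:
+@example
+$ astarithmetic image-jy.fits 22.5 jy-to-counts
+@end example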
-For example, let's assume that you have run @ref{NoiseChisel} and @ref{Segment} on an image to extract all clumps and objects.
-With the command below on Segment's output, you will have a smaller image that
only contains the sky-subtracted input pixels corresponding to object 263.
+@item counts-to-nanomaggy
+@cindex Nanomaggy
+Convert counts to Nanomaggy (with fixed zero point of 22.5, used as the pixel
units of many surveys like SDSS).
+For example if your image has a zero point of 24.93, you can convert it to
Nanomaggies with the command below:
@example
-$ astarithmetic seg.fits -hINPUT seg.fits -hOBJECTS \
- 263 ne nan where trim --output=obj-263.fits
+$ astarithmetic image.fits 24.93 counts-to-nanomaggy
@end example
-@item collapse-sum
-Collapse the given dataset (second popped operand), by summing all elements
along the first popped operand (a dimension in FITS standard: counting from
one, from fastest dimension).
-The returned dataset has one dimension less compared to the input.
-
-The output will have a double-precision floating point type irrespective of
the input dataset's type.
-Doing the operation in double-precision (64-bit) floating point will help the
collapse (summation) be affected less by floating point errors.
-But afterwards, single-precision floating points are usually enough in real
(noisy) datasets.
-So depending on the type of the input and its nature, it is recommended to use
one of the type conversion operators on the returned dataset.
-
-@cindex World Coordinate System (WCS)
-If any WCS is present, the returned dataset will also lack the respective
dimension in its WCS matrix.
-Therefore, when the WCS is important for later processing, be sure that the
input is aligned with the respective axes: all non-diagonal elements in the WCS
matrix are zero.
-
-@cindex Data cubes
-@cindex 3D data-cubes
-@cindex Cubes (3D data)
-@cindex Narrow-band image
-@cindex IFU: Integral Field Unit
-@cindex Integral field unit (IFU)
-One common application of this operator is the creation of pseudo broad-band or narrow-band 2D images from 3D data cubes; for example, integral field unit (IFU) data products that have two spatial dimensions (first two FITS dimensions) and one spectral dimension (third FITS dimension).
-The command below will collapse the whole third dimension into a 2D array the
size of the first two dimensions, and then convert the output to
single-precision floating point (as discussed above).
+@item nanomaggy-to-counts
+@cindex Nanomaggy
+Convert Nanomaggy to counts.
+Nanomaggy is defined to have a fixed zero point of 22.5 and is the pixel units
of many surveys like SDSS.
+For example if you would like to convert an image in units of Nanomaggy (for
example from SDSS) to the counts of a camera with a zero point of 25.92, you
can use the command below:
@example
-$ astarithmetic cube.fits 3 collapse-sum float32
+$ astarithmetic image.fits 25.92 nanomaggy-to-counts
@end example
-@item collapse-mean
-Similar to @option{collapse-sum}, but the returned dataset will be the mean
value along the collapsed dimension, not the sum.
+@item mag-to-jy
+Convert AB magnitudes to Janskys, see @ref{Brightness flux magnitude}.
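+For example, the minimal command below (just a sketch) converts an AB magnitude of 22.5 to Janskys (giving @mymath{3.631\times10^{-6}} Jy, i.e., one nanomaggy):
+@example
+$ echo 22.5 | asttable -c'arith $1 mag-to-jy'
+@end example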
-@item collapse-number
-Similar to @option{collapse-sum}, but the returned dataset will be the number
of non-blank values along the collapsed dimension.
-The output will have a 32-bit signed integer type.
-If the input dataset does not have blank values, all the elements in the
returned dataset will have a single value (the length of the collapsed
dimension).
-Therefore this is mostly relevant when there are blank values in the dataset.
+@item jy-to-mag
+Convert Janskys to AB magnitude, see @ref{Brightness flux magnitude}.
-@item collapse-min
-Similar to @option{collapse-sum}, but the returned dataset will have the same
numeric type as the input and will contain the minimum value for each pixel
along the collapsed dimension.
+@item au-to-pc
+@cindex Parsecs
+@cindex Astronomical Units (AU)
+Convert Astronomical Units (AUs) to Parsecs (PCs).
+This operator takes a single argument which is interpreted to be the input AUs.
+The conversion is based on the definition of Parsecs: @mymath{1 \rm{PC} = 1/\tan(1^{\prime\prime}) \rm{AU}}, where @mymath{1^{\prime\prime}} is one arcsecond.
+In other words, @mymath{1 (\rm{PC}) = 648000/\pi (\rm{AU})}.
+For example, if we take Pluto's average distance to the Sun to be 40 AUs, we
can obtain its distance in Parsecs using this command:
-@item collapse-max
-Similar to @option{collapse-sum}, but the returned dataset will have the same
numeric type as the input and will contain the maximum value for each pixel
along the collapsed dimension.
+@example
+$ echo 40 | asttable -c'arith $1 au-to-pc'
+@end example
-@item collapse-median
-Similar to @option{collapse-sum}, but the returned dataset will have the same
numeric type as the input and will contain the median value for each pixel
along the collapsed dimension.
+@item pc-to-au
+Convert Parsecs (PCs) to Astronomical Units (AUs).
+This operator takes a single argument which is interpreted to be the input PCs.
+For more on the conversion equation, see the description of @code{au-to-pc}.
+For example, Proxima Centauri (the nearest star to the Solar system) is 1.3020 Parsecs from the Sun; we can calculate this distance in units of AUs with the command below:
-The median involves sorting; therefore @code{collapse-median} will do the calculations on different CPU threads to speed up the operation.
-By default, Arithmetic will detect and use all available threads, but you can
override this with the @option{--numthreads} (or @option{-N}) option.
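-For example, a minimal command like the one below (assuming the cube of the @code{collapse-sum} example above) would return the 2D image of per-pixel medians along the third dimension:
-@example
-$ astarithmetic cube.fits 3 collapse-median
-@end example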
+@example
+$ echo 1.3020 | asttable -c'arith $1 pc-to-au'
+@end example
-@item collapse-sigclip-mean
-Collapse the input dataset (fourth popped operand) along the FITS dimension
given as the first popped operand by calculating the sigma-clipped mean.
-The sigma-clipping parameters (namely, the multiple of sigma and termination
criteria) are read as the third and second popped operands respectively.
-For more on sigma-clipping, see @ref{Sigma clipping}.
+@item ly-to-pc
+@cindex Light-year
+Convert Light-years (LY) to Parsecs (PCs).
+This operator takes a single argument which is interpreted to be the input LYs.
+The conversion is done from IAU's definition of the light-year
(9460730472580800 m @mymath{\approx} 63241.077 AU = 0.306601 PC, for the
conversion of AU to PC, see the description of @code{au-to-pc}).
-For example, with the command below, the pixels of the input 2-dimensional @file{image.fits} will be collapsed into a one-dimensional output.
-The first popped operand is @code{2}, so it will collapse all the pixels that are vertically on top of each other, such that the output will have the same number of pixels as the horizontal axis of the input.
-During the collapsing, all pixels that are more than @mymath{3\sigma} (third
popped operand) are rejected, and the clipping will continue until the standard
deviation changes less than @mymath{0.2} between clips.
+For example, the distance of the Andromeda galaxy from our galaxy is 2.5 million light-years, so its distance in kilo-Parsecs can be calculated with the command below (since we want the output in kilo-Parsecs, we divide the output of this operator by 1000):
@example
-$ astarithmetic image.fits 3 0.2 2 collapse-sigclip-mean \
- --output=collapsed-vertical.fits
+$ echo 2.5e6 | asttable -c'arith $1 ly-to-pc 1000 /'
@end example
-@cartouche
-@noindent
-@strong{Printing output of collapse in plain-text:} the default datatype of
@code{collapse-sigclip-mean} is 32-bit floating point.
-This is sufficient for any observed astronomical data.
-However, if you request a plain-text output, or decide to print/view the
output as plain-text on the standard output, the full set of decimals may not
be printed in some situations.
-This can lead to apparently discrete values in the output of this operator
when viewed in plain-text!
-The FITS format is always superior (since it stores the value in binary,
therefore not having the problem above).
-But if you are forced to save the output in plain-text, use the @code{float64}
operator after this to change the type to 64-bit floating point (which will
print more decimals).
-@end cartouche
-
-@item collapse-sigclip-std
-Collapse the input dataset along the given FITS dimension by calculating the
sigma-clipped standard deviation.
-Except for returning the standard deviation after clipping, this function is
similar to @code{collapse-sigclip-mean}, see the description of that operator
for more.
+@item pc-to-ly
+Convert Parsecs (PCs) to Light-years (LY).
+This operator takes a single argument which is interpreted to be the input PCs.
+For the conversion and an example of the inverse of this operator, see the
description of @code{ly-to-pc}.
-@item collapse-sigclip-median
-Collapse the input dataset along the given FITS dimension by calculating the
sigma-clipped median.
-Except for returning the median after clipping, this function is similar to
@code{collapse-sigclip-mean}, see the description of that operator for more.
+@item ly-to-au
+Convert Light-years (LY) to Astronomical Units (AUs).
+This operator takes a single argument which is interpreted to be the input LYs.
+For the conversion and a similar example, see the description of
@code{ly-to-pc}.
-@item collapse-sigclip-number
-Collapse the input dataset along the given FITS dimension by calculating the number of elements that remain after sigma-clipping.
-Except for returning the number after clipping, this function is similar to
@code{collapse-sigclip-mean}, see the description of that operator for more.
+@item au-to-ly
+Convert Astronomical Units (AUs) to Light-years (LY).
+This operator takes a single argument which is interpreted to be the input AUs.
+For the conversion and a similar example, see the description of
@code{ly-to-pc}.
+@end table
-@item add-dimension-slow
-Build a higher-dimensional dataset from all the input datasets stacked after
one another (along the slowest dimension).
-The first popped operand has to be a single number.
-It is used by the operator to know how many operands it should pop from the
stack (and the size of the output in the new dimension).
-The rest of the operands must have the same size and numerical data type.
-This operator currently only works for 2D input operands, please contact us if
you want inputs to have different dimensions.
+@node Statistical operators, Stacking operators, Unit conversion operators,
Arithmetic operators
+@subsubsection Statistical operators
-The output's WCS (which should have a different dimensionality compared to the
inputs) can be read from another file with the @option{--wcsfile} option.
-If no file is specified for the WCS, the first dataset's WCS will be used; you can later add/change the necessary WCS keywords with the FITS keyword modification features of the Fits program (see @ref{Fits}).
+The operators in this section take a single dataset as input, and will return
the desired statistic as a single value.
-If your datasets do not have the same type, you can use the type
transformation operators of Arithmetic that are discussed below.
-Just beware of overflow if you are transforming to a smaller type, see
@ref{Numeric data types}.
+@table @command
-For example, let's assume you have 3 two-dimensional images @file{a.fits},
@file{b.fits} and @file{c.fits} (each with @mymath{200\times100} pixels).
-You can construct a 3D data cube with @mymath{200\times100\times3} voxels
(volume-pixels) using the command below:
+@item minvalue
+Minimum value in the first popped operand, so ``@command{a.fits minvalue}''
will push the minimum pixel value in this image onto the stack.
+When this operator acts on a single image, the output (operand that is put
back on the stack) will no longer be an image, but a number.
+The output of this operator has the same type as the input.
+This operator is mainly intended for multi-element datasets (for example, images or data cubes); if the popped operand is a number, it will just be returned without any change.
+Note that when the final remaining/output operand is a single number, it is
printed onto the standard output.
+For example, with the command below the minimum pixel value in
@file{image.fits} will be printed in the terminal:
@example
-$ astarithmetic a.fits b.fits c.fits 3 add-dimension-slow
+$ astarithmetic image.fits minvalue
@end example
-@item add-dimension-fast
-Similar to @code{add-dimension-slow} but along the fastest dimension.
-This operator currently only works for 1D input operands, please contact us if
you want inputs to have different dimensions.
-
-For example, let's assume you have 3 one-dimensional datasets, each with 100
elements.
-With this operator, you can construct a @mymath{3\times100} pixel FITS image that has 3 pixels along the horizontal and 100 pixels along the vertical.
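-For example, assuming the three hypothetical 1D inputs @file{a.fits}, @file{b.fits} and @file{c.fits}, a minimal sketch would be:
-@example
-$ astarithmetic a.fits b.fits c.fits 3 add-dimension-fast
-@end example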
-@end table
+However, the output above also includes a lot of extra information that is not relevant in this context.
+If you just want the final number, run Arithmetic in quiet mode:
+@example
+$ astarithmetic image.fits minvalue -q
+@end example
-@node Conditional operators, Mathematical morphology operators, Dimensionality
changing operators, Arithmetic operators
-@subsubsection Conditional operators
+Also see the description of @command{sqrt} for other example usages of this operator.
-Conditional operators take two inputs and return a binary output that can only have two values: 0 (for pixels where the condition was false) or 1 (for pixels where the condition was true).
-Because of the binary (2-valued) nature of their outputs, the output is stored in an @code{unsigned char} data type (see @ref{Numeric data types}) to speed up processing and take less space in storage.
-There are two exceptions to the general features above: @code{isblank} only takes one input, and @code{where} takes three (without returning a binary output); see their descriptions for more.
-
-@table @command
-@item lt
-Less than: creates a binary output (values either 0 or 1) where each pixel
will be 1 if the second popped operand is smaller than the first popped operand
and 0 otherwise.
-If both operands are images, then all the pixels will be compared with their
counterparts in the other image.
-
-For example, the pixels in the output of the command below will have a value
of 1 (true) if their value in @file{image1.fits} is less than their value in
@file{image2.fits}.
-Otherwise, their value will be 0 (false).
-@example
-$ astarithmetic image1.fits image2.fits lt
-@end example
-If only one operand is an image, then all the pixels will be compared with the
single value (number) of the other operand.
-For example:
-@example
-$ astarithmetic image1.fits 1000 lt
-@end example
-Finally if both are numbers, then the output is also just one number (0 or 1).
-@example
-$ astarithmetic 4 5 lt
-@end example
-
-
-@item le
-Less or equal: similar to @code{lt} (`less than' operator), but returning 1
when the second popped operand is smaller or equal to the first.
+@item maxvalue
+Maximum value of first operand in the same type, similar to
@command{minvalue}, see the description there for more.
For example
@example
-$ astarithmetic image1.fits 1000 le
+$ astarithmetic image.fits maxvalue -q
@end example
-@item gt
-Greater than: similar to @code{lt} (`less than' operator), but returning 1
when the second popped operand is greater than the first.
-For example
+@item numbervalue
+Number of non-blank elements in first operand in the @code{uint64} type (since
it is always a positive integer, see @ref{Numeric data types}).
+Its usage is similar to @command{minvalue}, for example
@example
-$ astarithmetic image1.fits 1000 gt
+$ astarithmetic image.fits numbervalue -q
@end example
-@item ge
-Greater or equal: similar to @code{lt} (`less than' operator), but returning 1
when the second popped operand is larger or equal to the first.
-For example
+@item sumvalue
+Sum of non-blank elements in first operand in the @code{float32} type.
+Its usage is similar to @command{minvalue}, for example
@example
-$ astarithmetic image1.fits 1000 ge
+$ astarithmetic image.fits sumvalue -q
@end example
-@item eq
-Equality: similar to @code{lt} (`less than' operator), but returning 1 when
the two popped operands are equal (to double precision floating point accuracy).
+@item meanvalue
+Mean value of non-blank elements in first operand in the @code{float32} type.
+Its usage is similar to @command{minvalue}, for example
@example
-$ astarithmetic image1.fits 1000 eq
+$ astarithmetic image.fits meanvalue -q
@end example
-@item ne
-Non-Equality: similar to @code{lt} (`less than' operator), but returning 1
when the two popped operands are @emph{not} equal (to double precision floating
point accuracy).
+@item stdvalue
+Standard deviation of non-blank elements in first operand in the
@code{float32} type.
+Its usage is similar to @command{minvalue}, for example
@example
-$ astarithmetic image1.fits 1000 ne
+$ astarithmetic image.fits stdvalue -q
@end example
-@item and
-Logical AND: returns 1 if both operands have a non-zero value, and 0 otherwise.
-Both operands have to be the same kind: either both images or both numbers; it mostly produces meaningful values when the inputs are binary (with pixel values of 0 or 1).
+@item medianvalue
+Median of non-blank elements in first operand with the same type.
+Its usage is similar to @command{minvalue}, for example
@example
-$ astarithmetic image1.fits image2.fits -g1 and
+$ astarithmetic image.fits medianvalue -q
@end example
-For example, if you only want to see which pixels in an image have a value
@emph{between} 50 (greater equal, or inclusive) and 200 (less than, or
exclusive), you can use this command:
-@example
-$ astarithmetic image.fits set-i i 50 ge i 200 lt and
-@end example
+@item unique
+Remove all duplicate (and blank) elements from the first popped operand.
+The unique elements of the dataset will be stored in a single-dimensional
dataset.
-@item or
-Logical OR: returns 1 if either of the operands is non-zero, and 0 only when both operands are zero.
-Both operands have to be the same kind: either both images or both numbers.
-The usage is similar to @code{and}.
+Recall that by default, single-dimensional datasets are stored as a table
column in the output.
+But you can use @option{--onedasimage} or @option{--onedonstdout} to
respectively store them as a single-dimensional FITS array/image, or to print
them on the standard output.
-For example, if you only want to see which pixels in an image have a value
@emph{outside of} -100 (greater equal, or inclusive) and 200 (less than, or
exclusive), you can use this command:
+Although you can use this operator on floating point datasets, due to floating-point errors it may give unreasonable values: even the tenth decimal digit is considered, although it is usually statistically meaningless (see @ref{Numeric data types}).
+It is therefore recommended to use this operator on integer datasets, like the labeled images of @ref{Segment output}, where each pixel has the integer label of the object/clump it is associated with.
+For example, let's assume you have cropped a region of a larger labeled image
and want to find the labels/objects that are within the crop.
+With this operator, this job is trivial:
@example
-$ astarithmetic image.fits set-i i -100 lt i 200 ge or
+$ astarithmetic seg-crop.fits unique
@end example
-@item not
-Logical NOT: returns 1 when the operand is 0 and 0 when the operand is
non-zero.
-The operand can be an image or number, for an image, it is applied to each
pixel separately.
-For example, if you want to know which pixels are not blank (and assuming that
we didn't have the @code{isnotblank} operator), you can use this @code{not}
operator on the output of the @command{isblank} operator described below:
-@example
-$ astarithmetic image.fits isblank not
-@end example
+@item noblank
+Remove all blank elements from the first popped operand.
+Since the blank pixels are being removed, the output dataset will always be
single-dimensional, independent of the dimensionality of the input.
+
+Recall that by default, single-dimensional datasets are stored as a table
column in the output.
+But you can use @option{--onedasimage} or @option{--onedonstdout} to
respectively store them as a single-dimensional FITS array/image, or to print
them on the standard output.
+
+For example, with the command below, the non-blank pixel values of
@file{cropped.fits} are printed on the command-line (the @option{--quiet}
option is used to remove the extra information that Arithmetic prints as it
reads the inputs, its version and its running time).
-@cindex Blank pixel
-@item isblank
-Test each pixel for being a blank value (see @ref{Blank pixels}).
-This is a conditional operator: the output has the same size and dimensions as
the input, but has an unsigned 8-bit integer type with two possible values:
either 1 (for a pixel that was blank) or 0 (for a pixel that was not blank).
-See the description of the @code{lt} operator above; the difference is that @code{isblank} only needs one operand.
-For example:
@example
-$ astarithmetic image.fits isblank
+$ astarithmetic cropped.fits noblank --onedonstdout --quiet
@end example
-Because of the definition of a blank pixel, a blank value is not even equal to
itself, so you cannot use the equal operator above to select blank pixels.
-See the ``Blank pixels'' box below for more on Blank pixels in Arithmetic.
-In case you want to set non-blank pixels to an output pixel value of 1, it is
better to use @code{isnotblank} instead of `@code{isblank not}' (for more, see
the description of @code{isnotblank}).
+@end table
-@item isnotblank
-The inverse of the @code{isblank} operator above (see that description for
more).
-Therefore, if a pixel has a blank value, the output of this operator will have
a 0 value for it.
-This operator is therefore similar to running `@option{isblank not}', but
slightly more efficient (won't need the intermediate product of two operators).
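-For example, the minimal command below (assuming an input @file{image.fits}) returns a binary image that is 1 for all non-blank pixels of the input:
-@example
-$ astarithmetic image.fits isnotblank
-@end example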
+@node Stacking operators, Filtering operators, Statistical operators,
Arithmetic operators
+@subsubsection Stacking operators
-@item where
-Change the input (pixel) value @emph{where}/if a certain condition holds.
-The conditional operators above can be used to define the condition.
-Three operands are required for @command{where}.
-The input format is demonstrated in this simplified example:
+@cindex Stacking
+@cindex Coaddition
+The operators in this section are used when you have multiple datasets that
you would like to merge into one, commonly known as ``stacking'' or
``coaddition''.
+For example, you have taken ten exposures of your scientific target, and you would like to combine them all into one stacked image that is deeper than any single exposure.
-@example
-$ astarithmetic modify.fits binary.fits if-true.fits where
-@end example
+When calling these operators you should specify how many operands they should take in (unlike the rest of the operators, which have a fixed number of input operands).
+As described under the first operator below, you do this through their first popped operand (which should be a single integer number that is larger than one).
-The value of any pixel in @file{modify.fits} that corresponds to a non-zero
@emph{and} non-blank pixel of @file{binary.fits} will be changed to the value
of the same pixel in @file{if-true.fits} (this may also be a number).
-The 3rd and 2nd popped operands (@file{modify.fits} and @file{binary.fits}
respectively, see @ref{Reverse polish notation}) have to have the same
dimensions/size.
-@file{if-true.fits} can be either a number, or have the same dimension/size as
the other two.
+@table @command
-The 2nd popped operand (@file{binary.fits}) has to have @code{uint8} (or
@code{unsigned char} in standard C) type (see @ref{Numeric data types}).
-It is treated as a binary dataset (with only two values: zero and non-zero,
hence the name @code{binary.fits} in this example).
-However, commonly you will not be dealing with an actual FITS file of a
condition/binary image.
-You will probably define the condition in the same run based on some other
reference image and use the conditional and logical operators above to make a
true/false (or one/zero) image for you internally.
-For example, the case below:
+@cindex NaN
+@item min
+For each pixel, find the minimum value in all given datasets.
+The output will have the same type as the input.
-@example
-$ astarithmetic in.fits reference.fits 100 gt new.fits where
-@end example
+The first popped operand to this operator must be a positive integer number
which specifies how many further operands should be popped from the stack.
+All the subsequently popped operands must have the same type and size.
+This operator (and all the variable-operand operators similar to it that are
discussed below) will work in multi-threaded mode unless Arithmetic is called
with the @option{--numthreads=1} option, see @ref{Multi-threaded operations}.
-In the example above, any of the @file{in.fits} pixels that has a value in
@file{reference.fits} greater than @command{100}, will be replaced with the
corresponding pixel in @file{new.fits}.
-Effectively the @code{reference.fits 100 gt} part created the condition/binary
image which was added to the stack (in memory) and later used by @code{where}.
-The command above is thus equivalent to these two commands:
+Each pixel of the output of the @code{min} operator will be given the minimum
value of the same pixel from all the popped operands/images.
+For example, the following command will produce an image with the same size
and type as the three inputs, but each output pixel value will be the minimum
of the same pixel's values in all three input images.
@example
-$ astarithmetic reference.fits 100 gt --output=binary.fits
-$ astarithmetic in.fits binary.fits new.fits where
+$ astarithmetic a.fits b.fits c.fits 3 min --output=min.fits
@end example
-Finally, the input operands are read and used independently, so you can use
the same file more than once as any of the operands.
+Important notes:
+@itemize
-When the 1st popped operand to @code{where} (@file{if-true.fits}) is a single
number, it may be a NaN value (or any blank value, depending on its type) like
the example below (see @ref{Blank pixels}).
-When the number is blank, it will be converted to the blank value of the type
of the 3rd popped operand (@code{in.fits}).
-Hence, in the example below, all the pixels of @file{in.fits} whose value in @file{reference.fits} is greater than 100 will become blank, in the natural data type of @file{in.fits} (even though NaN values are only defined for floating point types).
+@item
+NaN/blank pixels will be ignored, see @ref{Blank pixels}.
-@example
-$ astarithmetic in.fits reference.fits 100 gt nan where
-@end example
-@end table
+@item
+The output will have the same type as the inputs.
+This is natural for the @command{min} and @command{max} operators, but for other similar operators (for example, @command{sum} or @command{mean}) the per-pixel operations will be done in double precision floating point and then stored back in the input type.
+Therefore, if the input was an integer, C's internal type conversion will be
used.
-@node Mathematical morphology operators, Bitwise operators, Conditional
operators, Arithmetic operators
-@subsubsection Mathematical morphology operators
+@item
+The operation will be multi-threaded, greatly speeding up the process if you
have large and numerous data to stack.
+You can disable multi-threaded operations with the @option{--numthreads=1}
option (see @ref{Multi-threaded operations}).
+@end itemize
-@cindex Mathematical morphology
-From Wikipedia: ``Mathematical morphology (MM) is a theory and technique for
the analysis and processing of geometrical structures, based on set theory,
lattice theory, topology, and random functions. MM is most commonly applied to
digital images''.
-In theory it extends a very large body of research and methods in image
processing, but currently in Gnuastro it mainly applies to images that are
binary (only have a value of 0 or 1).
-For example, you have applied the greater-than operator (@code{gt}, see
@ref{Conditional operators}) to select all pixels in your image that are larger
than a value of 100.
-But they will all have a value of 1, and you want to separate the various
groups of pixels that are connected (for example, peaks of stars in your image).
-With the @code{connected-components} operator, you can give each connected
region of the output of @code{gt} a separate integer label.
+@item max
+For each pixel, find the maximum value in all given datasets.
+The output will have the same type as the input.
+This operator is called similar to the @command{min} operator, please see
there for more.
+For example
+@example
+$ astarithmetic a.fits b.fits c.fits 3 max -omax.fits
+@end example
-@table @command
-@item erode
-@cindex Erosion
-Erode the foreground pixels (with value @code{1}) of the input dataset (second
popped operand).
-The first popped operand is the connectivity (see description in
@command{connected-components}).
-Erosion is simply a flipping of all foreground pixels (with value @code{1}) to
background (with value @code{0}) that are ``touching'' background pixels.
-``Touching'' is defined by the connectivity.
-In effect, this operator ``carves off'' the outer borders of the foreground,
making them thinner.
-This operator assumes a binary dataset (all pixels are @code{0} or @code{1}).
-For example, imagine that you have an astronomical image with a mean/sky value
of 0 units and a standard deviation (@mymath{\sigma}) of 100 units and many
galaxies in it.
-With the first command below, you can apply a threshold of @mymath{2\sigma} on
the image (by only keeping pixels that are greater than 200 using the
@command{gt} operator).
-The output of thresholding the image is a binary image (each pixel is either
smaller or equal to the threshold or larger than it).
-You can then erode the binary image with the second command below to remove
very small false positives (one or two pixel peaks).
+@item number
+For each pixel, count the number of non-blank pixels in all given datasets.
+The output will be an unsigned 32-bit integer datatype (see @ref{Numeric data
types}).
+This operator is called similar to the @command{min} operator, please see
there for more.
+For example
@example
-$ astarithmetic image.fits 200 gt -obinary.fits
-$ astarithmetic binary.fits 2 erode -oout.fits
+$ astarithmetic a.fits b.fits c.fits 3 number -onum.fits
@end example
-In fact, you can merge these operations into one command thanks to the reverse
polish notation (see @ref{Reverse polish notation}):
+Some datasets may have blank values (which are also ignored in all similar
operators like @command{min}, @command{sum}, @command{mean} or
@command{median}).
+Hence, the final pixel values of this operator will not, in general, be equal
to the number of inputs.
+This operator is therefore mostly called in parallel with those operators to
know the ``weight'' of each pixel (in case you want to only keep pixels that
had the full exposure for example).
+
+@item sum
+For each pixel, calculate the sum in all given datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similar to the @command{min} operator, please see
there for more.
+For example
@example
-$ astarithmetic image.fits 200 gt 2 erode -oout.fits
+$ astarithmetic a.fits b.fits c.fits 3 sum -ostack-sum.fits
@end example
-To see the effect of connectivity, try this:
+@item mean
+For each pixel, calculate the mean in all given datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similar to the @command{min} operator, please see
there for more.
+For example
@example
-$ astarithmetic image.fits 200 gt 1 erode -oout-con-1.fits
+$ astarithmetic a.fits b.fits c.fits 3 mean -ocoadd-mean.fits
@end example
-@item dilate
-@cindex Dilation
-Dilate the foreground pixels (with value @code{1}) of the binary input dataset
(second popped operand).
-The first popped operand is the connectivity (see description in
@command{connected-components}).
-Dilation is simply a flipping of all background pixels (with value @code{0})
to foreground (with value @code{1}) that are ``touching'' foreground pixels.
-``Touching'' is defined by the connectivity.
-In effect, this expands the outer borders of the foreground.
-This operator assumes a binary dataset (all pixels are @code{0} and @code{1}).
-The usage is similar to @code{erode}, for example:
+@item std
+For each pixel, find the standard deviation in all given datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similar to the @command{min} operator, please see
there for more.
+For example
@example
-$ astarithmetic binary.fits 2 dilate -oout.fits
+$ astarithmetic a.fits b.fits c.fits 3 std -ostd.fits
@end example
-@item number-neighbors
-Return a dataset of the same size as the second popped operand, but where each
non-zero and non-blank input pixel is replaced with the number of its non-zero
and non-blank neighbors.
-The first popped operand is the connectivity (see above) and must be a single value of an integer type.
-The dataset is assumed to be binary (having an unsigned 8-bit integer type).
+@item median
+For each pixel, find the median in all given datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similar to the @command{min} operator, please see
there for more.
+For example
+@example
+$ astarithmetic a.fits b.fits c.fits 3 median \
+ --output=stack-median.fits
+@end example
-For example with the command below, you can select all pixels above a value of
100 in your image with the ``greater-than'' or @code{gt} operator (see
@ref{Conditional operators}).
-Recall that the output of all conditional operators is a binary output (having
a value of 0 or 1).
-In the same command, we will then find how many neighboring pixels of each
pixel (that was originally above the threshold) are also above the threshold.
+@item quantile
+For each pixel, find the quantile from all given datasets.
+The output will have the same numeric data type and size as the input datasets.
+Besides the input datasets, the quantile operator also needs a single
parameter (the requested quantile).
+The parameter should be the first popped operand, with a value between (and
including) 0 and 1.
+The second popped operand must be the number of datasets to use.
+
+In the example below, the first-popped operand (@command{0.7}) is the
quantile, the second-popped operand (@command{3}) is the number of datasets to
pop.
@example
-$ astarithmetic image.fits 100 gt 2 number-neighbors
+$ astarithmetic a.fits b.fits c.fits 3 0.7 quantile
@end example
-@item connected-components
-@cindex Connected components
-Find the connected components in the input dataset (second popped operand).
-The first popped operand is the connectivity used in the connected components algorithm.
-The second popped operand is the dataset where connected components are to be
found.
-It is assumed to be a binary image (with values of 0 or 1).
-It must have an 8-bit unsigned integer type which is the format produced by
conditional operators.
-This operator will return a labeled dataset where the non-zero pixels in the
input will be labeled with a counter (starting from 1).
+@item sigclip-number
+For each pixel, find the sigma-clipped number (after removing outliers) in all
given datasets.
+The output will have an unsigned 32-bit integer type (see @ref{Numeric data types}).
-The connectivity is a number between 1 and the number of dimensions in the
dataset (inclusive).
-A value of 1 corresponds to the weakest (symmetric) connectivity between elements, and the number of dimensions to the strongest.
-For example, on a 2D image, a connectivity of 1 corresponds to 4-connected
neighbors and 2 corresponds to 8-connected neighbors.
+This operator will combine the specified number of inputs into a single output
that contains the number of remaining elements after @mymath{\sigma}-clipping
on each element/pixel (for more on @mymath{\sigma}-clipping, see @ref{Sigma
clipping}).
+This operator is very similar to @command{min}, with the exception that it
expects two operands (parameters for sigma-clipping) before the total number of
inputs.
+The first popped operand is the termination criteria and the second is the
multiple of @mymath{\sigma}.
-One example usage of this operator can be the identification of regions above
a certain threshold, as in the command below.
-With this command, Arithmetic will first separate all pixels greater than 100
into a binary image (where pixels with a value of 1 are above that value).
-Afterwards, it will label all those that are connected.
+For example, in the command below, the first popped operand (@command{0.2}) is
the sigma clipping termination criteria.
+If the termination criteria is larger than, or equal to, 1 it is interpreted
as the number of clips to do.
+But if it is between 0 and 1, then it is the tolerance level on the standard
deviation (see @ref{Sigma clipping}).
+The second popped operand (@command{5}) is the multiple of sigma to use in
sigma-clipping.
+The third popped operand (@command{3}) is the number of datasets that will be used (similar to the first popped operand to @command{min}).
@example
-$ astarithmetic in.fits 100 gt 2 connected-components
+$ astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-number
@end example
-If your input dataset does not have a binary type, but you know all its values
are 0 or 1, you can use the @code{uint8} operator (below) to convert it to
binary.
-
-@item fill-holes
-Flip background (0) pixels surrounded by foreground (1) in a binary dataset.
-This operator takes two operands (similar to @code{connected-components}): the
second is the binary (0 or 1 valued) dataset to fill holes in and the first
popped operand is the connectivity (to define a hole).
-Imagine that in your dataset there are some holes with zero value inside the
objects with one value (for example, the output of the thresholding example of
@command{erode}) and you want to fill the holes:
+@item sigclip-median
+For each pixel, find the sigma-clipped median in all given datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similarly to the @command{sigclip-number} operator; please see there for more.
+For example:
@example
-$ astarithmetic binary.fits 2 fill-holes
+$ astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-median
@end example
-@item invert
-Invert an unsigned integer dataset (will not work on other data types, see
@ref{Numeric data types}).
-This is the only operator that ignores blank values (which are set to be the
maximum values in the unsigned integer types).
+@item sigclip-mean
+For each pixel, find the sigma-clipped mean in all given datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similarly to the @command{sigclip-number} operator; please see there for more.
+For example:
+@example
+$ astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-mean
+@end example
-This is useful in cases where the target(s) has(have) been imaged in
absorption as raw formats (which are unsigned integer types).
-With this option, the maximum value for the given type will be subtracted from
each pixel value, thus ``inverting'' the image, so the target(s) can be treated
as emission.
-This can be useful when the higher-level analysis methods/tools only work on
emission (positive skew in the noise, not negative).
+@item sigclip-std
+For each pixel, find the sigma-clipped standard deviation in all given
datasets.
+The output will have a single-precision (32-bit) floating point type.
+This operator is called similarly to the @command{sigclip-number} operator; please see there for more.
+For example:
@example
-$ astarithmetic image.fits invert
+$ astarithmetic a.fits b.fits c.fits 3 5 0.2 sigclip-std
@end example
@end table
-@node Bitwise operators, Numerical type conversion operators, Mathematical
morphology operators, Arithmetic operators
-@subsubsection Bitwise operators
-
-@cindex Bitwise operators
-Astronomical images are usually stored as an array of multi-byte pixels with different sizes for different precision levels (see @ref{Numeric data types}).
-For example, images from CCDs are usually in the unsigned 16-bit integer type
(each pixel takes 16 bits, or 2 bytes, of memory) and fully reduced deep images
have a 32-bit floating point type (each pixel takes 32 bits or 4 bytes).
-
-On the other hand, during the data reduction, we need to preserve a lot of
meta-data about some pixels.
-For example, if a cosmic ray had hit the pixel during the exposure, or if the
pixel was saturated, or is known to have a problem, or if the optical
vignetting is too strong on it.
-A crude solution is to make a new image when checking for each one of these
things and make a binary image where we flag (set to 1) pixels that satisfy any
of these conditions above, and set the rest to zero.
-However, processing pipelines sometimes need more than 20 flags to store
important per-pixel meta-data, and recall that the smallest numeric data type
is one byte (or 8 bits, that can store up to 256 different values), while we
only need two values for each flag!
-This is a major waste of storage space!
-
-@cindex Flag (mask) images
-@cindex Mask (flag) images
-A much more optimal solution is to use the bits within each pixel to store
different flags!
-In other words, if you have an 8-bit pixel, use each bit as a flag to mark if
a certain condition has happened on a certain pixel or not.
-For example, let's set the following standard based on the four cases
mentioned above: the first bit will show that a cosmic ray has hit that pixel.
-So if a pixel is only affected by cosmic rays, it will have this sequence of
bits (note that the bit-counting starts from the right): @code{00000001}.
-The second bit shows that the pixel was saturated (@code{00000010}), the third
bit shows that it has known problems (@code{00000100}) and the fourth bit shows
that it was affected by vignetting (@code{00001000}).
-
-Since each bit is independent, we can thus mark multiple metadata about that
pixel in the actual image, within a single ``flag'' or ``mask'' pixel of a flag
or mask image that has the same number of pixels.
-For example, a flag-pixel with the following bits @code{00001001} shows that
it has been affected by cosmic rays @emph{and} it has been affected by
vignetting at the same time.
-The common data type to store these flagging pixels are unsigned integer types
(see @ref{Numeric data types}).
-Therefore when you open an unsigned 8-bit flag image in a viewer like DS9, you
will see a single integer in each pixel that actually has 8 layers of metadata
in it!
-For example, the integer you will see for the bit sequences given above will
respectively be: @mymath{2^0=1} (for a pixel that only has cosmic ray),
@mymath{2^1=2} (for a pixel that was only saturated), @mymath{2^2=4} (for a
pixel that only has known problems), @mymath{2^3=8} (for a pixel that is only
affected by vignetting) and @mymath{2^0 + 2^3 = 9} (for a pixel that has a
cosmic ray @emph{and} was affected by vignetting).
-
-You can later use this bit information to mark objects in your final analysis
or to mask certain pixels.
-For example, you may want to set all pixels affected by vignetting to NaN, but
can interpolate over cosmic rays.
-You therefore need ways to separate the pixels with a desired flag(s) from the
rest.
-It is possible to treat a flag pixel as a single integer (and try to define
certain ranges in value to select certain flags).
-But a much easier and more robust way is to actually look at each pixel as a sequence of bits (not as a single integer!) and use the bitwise operators below for this job.
-For more on the theory behind bitwise operators, see
@url{https://en.wikipedia.org/wiki/Bitwise_operation, Wikipedia}.
+@node Filtering operators, Pooling operators, Stacking operators, Arithmetic
operators
+@subsubsection Filtering (smoothing) operators
+Image filtering is commonly used for smoothing: every pixel value in the
output image is created by applying a certain statistic to the pixels in its
vicinity.
@table @command
-@item bitand
-Bitwise AND operator: only bits with values of 1 in both popped operands will
get the value of 1, the rest will be set to 0.
-For example, (assuming numbers can be written as bit strings on the
command-line): @code{00101000 00100010 bitand} will give @code{00100000}.
-Note that the bitwise operators only work on integer type datasets.
-
-@item bitor
-Bitwise inclusive OR operator: The bits where at least one of the two popped
operands has a 1 value get a value of 1, the others 0.
-For example, (assuming numbers can be written as bit strings on the command-line): @code{00101000 00100010 bitor} will give @code{00101010}.
-Note that the bitwise operators only work on integer type datasets.
-
-@item bitxor
-Bitwise exclusive OR operator: A bit will be 1 if it differs between the two
popped operands.
-For example, (assuming numbers can be written as bit strings on the command-line): @code{00101000 00100010 bitxor} will give @code{00001010}.
-Note that the bitwise operators only work on integer type datasets.
-
-@item lshift
-Bitwise left shift operator: shift all the bits of the first operand to the
left by a number of times given by the second operand.
-For example, (assuming numbers can be written as bit strings on the
command-line): @code{00101000 2 lshift} will give @code{10100000}.
-This is equivalent to multiplication by 4.
-Note that the bitwise operators only work on integer type datasets.
-
-@item rshift
-Bitwise right shift operator: shift all the bits of the first operand to the
right by a number of times given by the second operand.
-For example, (assuming numbers can be written as bit strings on the
command-line): @code{00101000 2 rshift} will give @code{00001010}.
-Note that the bitwise operators only work on integer type datasets.
-
-@item bitnot
-Bitwise not (more formally known as one's complement) operator: flip all the
bits of the popped operand (note that this is the only unary, or single
operand, bitwise operator).
-In other words, any bit with a value of @code{0} is changed to @code{1} and
vice-versa.
-For example, (assuming numbers can be written as bit strings on the
command-line): @code{00101000 bitnot} will give @code{11010111}.
-Note that the bitwise operators only work on integer type datasets/numbers.
-@end table
-@node Numerical type conversion operators, Random number generators, Bitwise
operators, Arithmetic operators
-@subsubsection Numerical type conversion operators
+@item filter-mean
+Apply mean filtering (or @url{https://en.wikipedia.org/wiki/Moving_average,
moving average}) on the input dataset.
+During mean filtering, each pixel (data element) is replaced by the mean value
of all its surrounding pixels (excluding blank values).
+The number of surrounding pixels in each dimension (to calculate the mean) is
determined through the earlier operands that have been pushed onto the stack
prior to the input dataset.
+The number of necessary operands is determined by the dimensions of the input
dataset (first popped operand).
+The order of the dimensions on the command-line is the order in FITS format.
+Here is one example:
-With the operators below you can convert the numerical data type of your
input, see @ref{Numeric data types}.
-Type conversion is particularly useful when dealing with integers, see
@ref{Integer benefits and pitfalls}.
+@example
+$ astarithmetic 5 4 image.fits filter-mean
+@end example
-As an example, let's assume that your colleague gives you many single exposure
images for processing, but they have a double-precision floating point type!
-You know that the statistical error a single-exposure image can never exceed 6
or 7 significant digits, so you would prefer to archive them as a
single-precision floating point and save space on your computer (a
double-precision floating point is also double the file size!).
-You can do this with the @code{float32} operator described below.
+@noindent
+In this example, each pixel is replaced by the mean of a 5 by 4 box around it.
+The box is 5 pixels along the first FITS dimension (horizontal when viewed in DS9) and 4 pixels along the second FITS dimension (vertical).
-@table @command
+Each pixel will be placed in the center of the box that the mean is calculated
on.
+If the given width along a dimension is even, then the center is assumed to be
between the pixels (not in the center of a pixel).
+When the pixel is close to the edge, the pixels of the box that fall outside
the image are ignored.
+Therefore, on the edge, fewer pixels will be used in calculating the mean.
-@item u8
-@itemx uint8
-Convert the type of the popped operand to 8-bit unsigned integer type (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+The final effect of mean filtering is to smooth the input image; it is essentially a convolution with a kernel that has identical values for all its pixels (i.e., it is flat), see @ref{Convolution process}.
-@item i8
-@itemx int8
-Convert the type of the popped operand to 8-bit signed integer type (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+Note that blank pixels will also be affected by this operator: if there are any non-blank elements in the box surrounding a blank pixel, that pixel will be given the mean of the non-blank elements in the filtered image, and will therefore no longer be blank.
+If blank elements are important for your analysis, you can use the
@code{isblank} operator with the @code{where} operator to set them back to
blank after filtering.
-@item u16
-@itemx uint16
-Convert the type of the popped operand to 16-bit unsigned integer type (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+For example, in the command below, we are first filtering the image, then setting its original blank elements back to blank in the output of the filtering (all within one Arithmetic command).
+Note how we are using the @code{set-} operator to give names to the temporary
outputs of steps and simplify the code (see @ref{Operand storage in memory or a
file}).
-@item i16
-@itemx int16
-Convert the type of the popped operand to 16-bit signed integer (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+@example
+$ astarithmetic image.fits -h1 set-in \
+ 5 4 in filter-mean set-filtered \
+ filtered in isblank nan where \
+ --output=out.fits
+@end example
-@item u32
-@itemx uint32
-Convert the type of the popped operand to 32-bit unsigned integer type (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+@item filter-median
+Apply @url{https://en.wikipedia.org/wiki/Median_filter, median filtering} on
the input dataset.
+This is very similar to @command{filter-mean}, except that instead of the mean
value of the box pixels, the median value is used to replace a pixel value.
+For more on how to use this operator, please see @command{filter-mean}.
-@item i32
-@itemx int32
-Convert the type of the popped operand to 32-bit signed integer type (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+The median is less susceptible to outliers compared to the mean.
+As a result, after median filtering, the pixel values will be more discontinuous than after mean filtering.
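+For example, the command below (a sketch analogous to the @command{filter-mean} example above) replaces each pixel with the median of a 5 by 4 box around it:
+@example
+$ astarithmetic 5 4 image.fits filter-median
+@end example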
-@item u64
-@itemx uint64
-Convert the type of the popped operand to 64-bit unsigned integer (see
@ref{Numeric data types}).
-The internal conversion of C will be used.
+@item filter-sigclip-mean
+Apply a @mymath{\sigma}-clipped mean filtering onto the input dataset.
+This is very similar to @code{filter-mean}, except that all outliers
(identified by the @mymath{\sigma}-clipping algorithm) have been removed, see
@ref{Sigma clipping} for more on the basics of this algorithm.
+As described there, two extra input parameters are necessary for
@mymath{\sigma}-clipping: the multiple of @mymath{\sigma} and the termination
criteria.
+@code{filter-sigclip-mean} therefore needs to pop two other operands from the
stack after the dimensions of the box.
-@item f32
-@itemx float32
-Convert the type of the popped operand to 32-bit (single precision) floating
point (see @ref{Numeric data types}).
-The internal conversion of C will be used.
-For example, if @file{f64.fits} is a 64-bit floating point image, and you want
to store it as a 32-bit floating point image, you can use the command below
(the second command is to show that the output file consumes half the storage)
+For example, the line below uses the same box size as the example of
@code{filter-mean}.
+However, all elements in the box that are iteratively beyond @mymath{3\sigma}
of the distribution's median are removed from the final calculation of the mean
until the change in @mymath{\sigma} is less than @mymath{0.2}.
@example
-$ astarithmetic f64.fits float32 --output=f32.fits
-$ ls -lh f64.fits f32.fits
+$ astarithmetic 3 0.2 5 4 image.fits filter-sigclip-mean
@end example
-@item f64
-@itemx float64
-Convert the type of the popped operand to 64-bit (double precision) floating
point (see @ref{Numeric data types}).
-The internal conversion of C will be used.
+The median (which needs a sorted dataset) is necessary for @mymath{\sigma}-clipping; therefore, @code{filter-sigclip-mean} can be significantly slower than @code{filter-mean}.
+However, if there are strong outliers in the dataset that you want to ignore
(for example, emission lines on a spectrum when finding the continuum), this is
a much better solution.
+
+@item filter-sigclip-median
+Apply a @mymath{\sigma}-clipped median filtering onto the input dataset.
+This operator and its necessary operands are almost identical to
@code{filter-sigclip-mean}, except that after @mymath{\sigma}-clipping, the
median value (which is less affected by outliers than the mean) is added back
to the stack.
@end table
-@node Random number generators, Box shape operators, Numerical type conversion
operators, Arithmetic operators
-@subsubsection Random number generators
+@node Pooling operators, Interpolation operators, Filtering operators,
Arithmetic operators
+@subsubsection Pooling operators
-When you simulate data (for example, see @ref{Sufi simulates a detection}),
everything is ideal and there is no noise!
-The final step of the process is to add simulated noise to the data.
-The operators in this section are designed for that purpose.
+@cindex Pooling
+@cindex Convolutional Neural Networks
+Pooling is one way of reducing the complexity of the input image by grouping
multiple input pixels into one output pixel (using any statistical measure).
+As a result, the output image has fewer pixels (less complexity).
+In computer vision, pooling is commonly used in @url{https://en.wikipedia.org/wiki/Convolutional_neural_network, Convolutional Neural Networks} (CNNs).
-@table @command
+In pooling, the inputs are an image (e.g., a FITS file) and a square pixel window (known as a pooling window).
+The window has to be smaller than the input's number of pixels in both dimensions, and its width is called the pool size.
+This window slides over all pixels in the input from the top-left corner to
the bottom-right corner (covering each input pixel only once).
+Currently the ``stride'' (or spacing between the windows as they slide over
the input) is equal to the window-size in Arithmetic.
+In other words, in pooling, the separate ``windows'' do not overlap with each
other on the input.
+Therefore, as listed below, there are two major differences with @ref{Spatial domain convolution} or @ref{Filtering operators}, but pooling has some similarities to @ref{Warp}.
+@itemize
+@item
+In convolution or filtering the input and output sizes are the same.
+However, the output of pooling has fewer pixels.
+@item
+In convolution or filtering, the kernel slides over the input in a pixel-by-pixel manner.
+As a result, the same pixel's value will be used in many of the output pixels.
+However, in pooling each input pixel is only used in a single output pixel.
+@item
+Special cases of warping an image are similar to pooling.
+For example, calling @code{pool-sum} with a pool size of 2 will give the same pixel values (except the outer edges) as giving the same image to @command{astwarp} with @option{--scale=1/2 --centeroncorner}.
+However, warping will only provide the sum of the input pixels; there is no easy way to generically define something like @code{pool-max} in Warp (which is far more general than pooling).
+Also, due to its generic features (for example, for non-linear warps), Warp is slower than the @code{pool-max} that is introduced here.
+@end itemize
-@item mknoise-sigma
-Add a fixed noise (Gaussian standard deviation) to each element of the input
dataset.
-This operator takes two arguments: the top/first popped operand is the noise
standard deviation, the next popped operand is the dataset that the noise
should be added to.
+@cartouche
+@noindent
+@strong{No WCS in output:} As of Gnuastro @value{VERSION}, the output of
pooling will not contain WCS information (primarily due to a lack of time by
developers).
+Please inform us of your interest in having it by contacting us at @command{bug-gnuastro@@gnu.org}.
+If you need @code{pool-sum}, you can use @ref{Warp} (which also modifies the
WCS, see note above).
+@end cartouche
-When @option{--quiet} is not given, a statement will be printed on each
invocation of this operator (if there are multiple calls to the
@code{mknoise-*}, the statement will be printed multiple times).
-It will show the random number generator function and seed that was used in
that invocation, see @ref{Generating random numbers}.
-Reproducibility of the outputs can be ensured with the @option{--envseed}
option, see below for more.
+If the width or height of input is not divisible by the pool size, the pool
window will go beyond the input pixel grid.
+In this case, the window pixels that do not overlap with the input are given a
blank value (and thus ignored in the calculation of the desired statistical
operation).
+
+The simple ASCII figure below shows the pooling operation where the input is a
@mymath{3\times3} pixel image with a pool size of 2 pixels.
+In the center of the second row, you see the intermediate input matrix that
highlights how the input and output pixels relate with each other.
+Since the input is @mymath{3\times3} and we have a pool size of 2, as mentioned above, blank pseudo-pixels are added with a value of @code{B} (for blank).
-For example, with the first command below, @file{image.fits} will be degraded
by a noise of standard deviation 3 units.
@example
-$ astarithmetic image.fits 3 mknoise-sigma
+ Pool window: Input:
+ +-----------+ +-------------+
+ | | | | 10 12 9 |
+ | _ _ | _ _ |___________________________| 31 4 1 |
+ | | | || || | 16 5 8 |
+ | | | || || +-------------+
+ +-----------+ || ||
+ The pooling window 2*2 || ||
+ stride 2 \/ \/
+ +---------------------+
+ |/ 10 12\|/ 9 B \|
+ | | |
+ +-------+ pool-min |\ 31 4 /|\ 1 B /| pool-max +-------+
+ | 4 1 | /------ |---------------------| ------\ |31 9 |
+ | 5 8 | \------ |/ 16 5 \|/ 8 B \| ------/ |16 8 |
+ +-------+ | | | +-------+
+ |\ B B /.\ B B /|
+ +---------------------+
@end example
-Alternatively, you can use this operator within column arithmetic in the Table
program, to generate a random number like below (centered on 0, with
@mymath{\sigma=3}) like the first command below.
-With the second command, you can put it into a shell variable for later usage.
+The choice of statistic depends on the specific use case, the characteristics of the input data, and the desired output; each statistic has its own advantages and disadvantages.
+Below, the various pooling operators of Arithmetic are listed:
-@example
-$ echo 0 | asttable -c'arith $1 3 mknoise-sigma'
-$ value=$(echo 0 | asttable -c'arith $1 3 mknoise-sigma')
-$ echo $value
-@end example
+@table @command
-You can also use this operator in combination with AWK to easily generate an
arbitrarily large table with random columns.
-In the example below, we will create a two column table with 20 rows.
-The first column will be centered on 5 and @mymath{\sigma_1=2}, the second
will be centered on 10 and @mymath{\sigma_2=3}:
+@item pool-max
+Apply max-pooling on the input dataset.
+This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
+Within the pooling window, this operator will place the largest value in the
output pixel (any blank pixels will be ignored).
+
+See the ASCII diagram above for a demonstration of how max-pooling works.
+Here is an example of using this operator:
@example
-$ echo 5 10 \
- | awk '@{for(i=0;i<20;++i) print $1, $2@}' \
- | asttable -c'arith $1 2 mknoise-sigma' \
- -c'arith $2 3 mknoise-sigma'
+$ astarithmetic image.fits 2 pool-max
@end example
-By adding an extra @option{--output=random.fits}, the table will be saved into
a file called @file{random.fits}, and you can change the @code{i<20} to
@code{i<5000} to have 5000 rows instead.
-Of course, if your input table has different values in the desired column the
noisy distribution will be centered on each input element, but all will have
the same scatter/sigma.
-
-You can use the @option{--envseed} option to fix the random number generator
seed (and thus get a reproducible result).
-For more on @option{--envseed}, see @ref{Generating random numbers}.
-When using column arithmetic in Table, it may happen that multiple columns
need random numbers (with any of the @code{mknoise-*} operators) in one call of
@command{asttable}.
-In such cases, the value given to @code{GSL_RNG_SEED} is incremented by one on
every call to the @code{mknoise-*} operators.
-Without this increment, when the column values are the same (happens a lot,
for no-noised datasets), the returned values for all columns will be identical.
-But this feature has a side-effect: that if the order of calling the
@code{mknoise-*} operators changes, the seeds used for each operator will
change@footnote{We have defined @url{https://savannah.gnu.org/task/?15971, Task
15971} in Gnuastro's project management system to address this.
-If you need this feature please send us an email at
@code{bug-gnuastro@@gnu.org} (to motivate us in its implementation).}.
+Max-pooling retains the largest value of the input window in the output, so the returned image is sharper where you have a strong signal-to-noise ratio and noisier in regions with no significant signal (only noise).
+It is therefore useful when the background of the image is dark and we are
interested in only the highest signal-to-noise ratio regions of the image.
-In case each data element should have an independent sigma, the first popped
operand can be a dataset of the same size as the second.
-In this case, for each element, a different noise measure (for example, sigma
in @code{mknoise-sigma}) will be used.
+@item pool-min
+Apply min-pooling on the input dataset.
+This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
+Except for the statistic used, this operator is similar to @code{pool-max}; see the description there for more.
-@item mknoise-poisson
-@cindex Poisson noise
-Add Poisson noise to each element of the input dataset (see @ref{Photon
counting noise}).
-This operator takes two arguments: 1. the first popped operand (just before
the operator) is the @emph{per-pixel} background value (in units of electron
counts).
-2. The second popped operand is the dataset that the noise should be added to.
+Min-pooling is mostly used when the image has a high signal-to-noise ratio and
a light background: min-pooling will select darker (lower-valued) pixels.
+For low signal-to-noise regions, this operator will increase the noise level
(similar to the maximum, the scatter in the minimum is very strong).
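+For example, the command below (a sketch following the calling convention of @code{pool-max}) applies min-pooling with a pool size of 2:
+@example
+$ astarithmetic image.fits 2 pool-min
+@end example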
-@cindex Dark night
-@cindex Gray night
-@cindex Nights (dark or gray)
-Recall that the background values reported by observatories (for example, to define dark or gray nights), or in papers, are usually reported in units of magnitudes per arcsecond squared.
-You need to do the conversion to counts per pixel manually.
-The conversion of magnitudes to counts is described below.
-For converting arcseconds squared to number of pixels, you can use the
@option{--pixelscale} option of @ref{Fits}.
-For example, @code{astfits image.fits --pixelscale}.
+@item pool-sum
+Apply sum-pooling to the input dataset.
+This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
+Except for the statistic used, this operator is similar to @code{pool-max}; see the description there for more.
-Except for the noise-model, this operator is very similar to
@code{mknoise-sigma} and the examples there apply here too.
-The main difference with @code{mknoise-sigma} is that in a Poisson
distribution the scatter/sigma will depend on each element's value.
+Sum-pooling will increase the signal-to-noise ratio at the cost of having a
smoother output (less resolution).
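+For example (a sketch, called like @code{pool-max}):
+@example
+$ astarithmetic image.fits 2 pool-sum
+@end example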
-For example, let's assume you have made a mock image called @file{mock.fits} with @ref{MakeProfiles} and its assumed zero point is 22.5 (for more on the zero point, see @ref{Brightness flux magnitude}).
-Let's assume the background level for the Poisson noise has a value of 19
magnitudes.
-You can first use the @code{mag-to-counts} operator to convert this background
magnitude into counts, then feed the background value in counts to
@code{mknoise-poisson} operator:
+@item pool-mean
+Apply mean pooling on the input dataset.
+This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
+Except for the statistic used, this operator is similar to @code{pool-max}; see the description there for more.
-@example
-$ astarithmetic mock.fits 19 22.5 mag-to-counts \
- mknoise-poisson
-@end example
+Mean pooling smooths out the image, hence sharp features may not be identified when this pooling method is used.
+It therefore preserves more information than max-pooling, but may also reduce the effect of the most prominent pixels.
+Mean pooling is often used where a more accurate representation of the input is required.
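+For example (a sketch, called like @code{pool-max}):
+@example
+$ astarithmetic image.fits 2 pool-mean
+@end example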
-Try changing the background value from 19 to 10 to see the effect!
-Recall that the tutorial @ref{Sufi simulates a detection} shows how you can
use MakeProfiles to build mock images.
+@item pool-median
+Apply median pooling on the input dataset.
+This operator takes two operands: the first popped operand is the width of the
square pooling window (which should be a single integer), and the second should
be the input image.
+Except for the statistic used, this operator is similar to @code{pool-max}; see the description there for more.
-@item mknoise-uniform
-Add uniform noise to each element of the input dataset.
-This operator takes two arguments: the top/first popped operand is the width
of the interval, the second popped operand is the dataset that the noise should
be added to (each element will be the center of the interval).
-The returned random values may happen to be the minimum interval value, but
will never be the maximum.
-Except for the noise-model, this operator behaves very similar to
@code{mknoise-sigma}, see the explanation there for more.
+In general, the mean is mathematically easier to interpret but more susceptible to outliers, while the median is less subject to the influence of outliers and thus gives a smoother image.
+Median pooling is therefore better for low signal-to-noise (noisy) features and extended features (where you do not want a single high or low valued pixel to affect the output).
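+For example (a sketch, called like @code{pool-max}):
+@example
+$ astarithmetic image.fits 2 pool-median
+@end example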
+@end table
-For example, with the command below, a random value will be selected between
10 to 14 (centered on 12, which is the only input data element, with a total
width of 4).
+@node Interpolation operators, Dimensionality changing operators, Pooling
operators, Arithmetic operators
+@subsubsection Interpolation operators
-@example
-echo 12 | asttable -c'arith $1 4 mknoise-uniform'
-@end example
+Interpolation is the process of removing blank pixels from a dataset (by
giving them a value based on the non-blank neighbors).
-Similar to the example in @code{mknoise-sigma}, you can pipe the output of
@command{echo} to @command{awk} before passing it to @command{asttable} to
generate a full column of uniformly selected values within the same interval.
+@table @command
-@item random-from-hist-raw
-Generate random values from a custom distribution (defined by a histogram).
-The output will have a double-precision floating point type (see @ref{Numeric
data types}).
-This operator takes three operands:
-@itemize
-@item
-The first popped operand (nearest to the operator) is the histogram values.
-The histogram is a 1-dimensional dataset (a table column) and contains the
probability of obtaining a certain interval of values.
-The histogram does not have to be normalized: the GNU Scientific Library (or
GSL, which is used by Gnuastro for this operator), will normalize it internally.
-The value of each bin (whose probability is given in the histogram) is given
in the second popped operand.
-Therefore these two operands have to have the same number of rows.
-@item
-The second popped operand is the bin value (mostly the bin center, but it can
be anything).
-The probability of each bin is defined in the histogram operand (first popped
operand).
-The bins can have any width (do not have to be evenly spaced), and any order.
-Just make sure that the same row in the bins column corresponds to the same
row in the histogram: the number of rows in the bins and histogram must be
equal.
-@item
-The third popped operand is the dataset that the random values should be
written over.
-Effectively only its size will be used by this operator (all values will be
over-written as a double-precision floating point number).
-@end itemize
-The first two operands have to be single-dimensional (a table column) and have
the same number of rows, but the last popped operand can have any number of
dimensions.
-You can use the @code{load-col-} operator to load the two bins and histogram
columns from an external file (see @ref{Loading external columns}).
+@item interpolate-medianngb
+Interpolate the blank elements of the second popped operand with the median of
nearest non-blank neighbors to each.
+The number of the nearest non-blank neighbors used to calculate the median is
given by the first popped operand.
-For example, in the command below, we first construct a fake histogram to
represent a @mymath{y=x^2} distribution with AWK.
-We aim to distribute random values from this distribution in a
@mymath{100\times100} image.
-Therefore, we use the @command{makenew} operator to construct an empty image
of that size, use the @command{load-col-} operator to load the histogram
columns into Arithmetic and put the output in @file{random.fits}.
-Finally we visually inspect @file{random.fits} with DS9 and also have a look
at its pixel distribution with @command{aststatistics}.
+The distance of the nearest non-blank neighbors is irrelevant in this
interpolation.
+The neighbors of each blank pixel will be parsed in expanding circular rings (for 2D images) or spherical surfaces (for a 3D cube), and each non-blank element over them is stored in memory.
+When the requested number of non-blank neighbors have been found, their median
is used to replace that blank element.
+For example, the line below replaces each blank element with the median of the
nearest 5 pixels.
@example
-$ echo "" | awk '@{for(i=1;i<5;++i) print i, i*i@}' \
- > histogram.txt
+$ astarithmetic image.fits 5 interpolate-medianngb
+@end example
-$ cat histogram.txt
-1 1
-2 4
-3 9
-4 16
+When you want to interpolate blank regions such that each blank region has a single fixed value (for example, the centers of saturated stars), this operator is not suitable: the pixels used to interpolate the various parts of the region differ.
+For such scenarios, you may use @code{interpolate-maxofregion} or @code{interpolate-minofregion} (described below).
-$ astarithmetic 100 100 2 makenew \
- load-col-1-from-histogram.txt \
- load-col-2-from-histogram.txt \
- random-from-hist-raw \
- --output=random.fits
+@item interpolate-meanngb
+Similar to @code{interpolate-medianngb}, but will fill the blank values of the
dataset with the mean value of the requested number of nearest neighbors.
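+Its usage is identical to @code{interpolate-medianngb}; for example, the sketch below replaces each blank element with the mean of its nearest 5 non-blank pixels:
+@example
+$ astarithmetic image.fits 5 interpolate-meanngb
+@end example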
-$ astscript-fits-view random.fits
+@item interpolate-minngb
+Similar to @code{interpolate-medianngb}, but will fill the blank values of the
dataset with the minimum value of the requested number of nearest neighbors.
-$ aststatistics random.fits --asciihist --numasciibins=50
- | *
- | *
- | *
- | *
- | * *
- | * *
- | * *
- | * * *
- | * * *
- |* * * *
- |* * * *
- |--------------------------------------------------
-@end example
+@item interpolate-maxngb
+Similar to @code{interpolate-medianngb}, but will fill the blank values of the
dataset with the maximum value of the requested number of nearest neighbors.
+One useful application of this operator is to fill the saturated pixels of stars in images.
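+For example, assuming the saturated pixels have already been set to blank, the sketch below (the choice of 5 neighbors is arbitrary) fills each of them with the maximum of its nearest non-blank neighbors:
+@example
+$ astarithmetic image.fits 5 interpolate-maxngb
+@end example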
-As you see, the 10000 pixels in the image only have values 1, 2, 3 or 4 (which
were the values in the bins column of @file{histogram.txt}), and the number of
times each of these values occurs follows the @mymath{y=x^2} distribution.
+@item interpolate-minofregion
+Interpolate all blank regions (consisting of many blank pixels that are
touching) in the second popped operand with the minimum value of the pixels
that are immediately bordering that region (a single value).
+The first popped operand is the connectivity (see description in
@command{connected-components}).
-Generally, any value given in the bins column will be used for the final
output values.
-For example, in the command below (for generating a histogram from an analytical function), we are shifting the bin values by 20 (while keeping the same probability distribution of @mymath{y=x^2}).
-If you re-run the Arithmetic command above after this, you will notice that
the pixels values are now one of the following 21, 22, 23 or 24 (instead of 1,
2, 3, or 4).
-But the shape of the histogram of the resulting random distribution will be
unchanged.
+For example, with the command below, all the connected blank regions of @file{image.fits} will be filled.
+Since the input is an image (2D dataset), a connectivity of 2 means that the independent blank regions are defined by 8-connected neighbors.
+If the connectivity was 1, the regions would be defined by 4-connectivity: blank regions that may only be touching on the corner of one pixel would be identified as separate regions.
@example
-$ echo "" | awk '@{for(i=1;i<5;++i) print 20+i, i*i@}' \
- > histogram.txt
+$ astarithmetic image.fits 2 interpolate-minofregion
@end example
-If you do not want the outputs to have exactly the value of the bin
identifier, but be a randomly selected value from a uniform distribution within
the bin, you should use @command{random-from-hist} (see below).
-
-As mentioned above, the output will have a double-precision floating point
type (see @ref{Numeric data types}).
-Therefore, by default each element of the output will consume 8 bytes
(64-bits) of storage.
-This is usually far more than the statistical error/precision of your data
(and just results in wasted storage in your file system, or wasted RAM when a
program that uses the data is being run, and a slower running time of the
program).
+@item interpolate-maxofregion
+@cindex Saturated pixels
+Similar to @code{interpolate-minofregion}, but the maximum is used to fill the
blank regions.
-It is therefore recommended to use a type-conversion operator after this
operator to put the output in the smallest type that can be used to safely
store your data without wasting storage, RAM or time.
-For the list of type conversion operators, see @ref{Numerical type conversion
operators}.
-Recall that you already know the values returned by this operator (they are
one of the values in the bins column).
+This operator can be useful in filling the saturated pixels of stars, for example.
+Recall that the @code{interpolate-maxngb} operator looks for the maximum value within a given number of neighboring pixels and is more useful in small noisy regions.
+Therefore, as the blank regions become larger, @code{interpolate-maxngb} can cause fragmentation in the connected blank region, because the nearest neighbors to one part of the blank region may not fall within the pixels searched for the other parts.
+With this operator, the size of the blank region is irrelevant: all the pixels bordering the blank region are parsed, and their maximum value is used for the whole region.
+@end table
-For example, in the example above, the whole image only has values 1, 2, 3 or
4.
-Since they are always positive and are below 255, we can safely place them in
an unsigned 8-bit integer (see @ref{Numeric data types}) with the command below
(note the @code{uint8} after the operator name, and that we are using a
different name for the output).
-After building the new image, let's have a look at the sizes of the two images
with @command{ls -l}:
+@node Dimensionality changing operators, Conditional operators, Interpolation
operators, Arithmetic operators
+@subsubsection Dimensionality changing operators
-@example
-$ astarithmetic 100 100 2 makenew \
- load-col-1-from-histogram.txt \
- load-col-2-from-histogram.txt \
- random-from-hist-raw uint8 \
- --output=random-u8.fits
+Through these operators you can change the dimensions of the output by applying certain statistics along the dimensions that should be removed.
+For example, let's assume you have a 3D data cube that has 300 by 300 pixels
in the RA and Dec dimensions (first two dimensions), and 3600 slices along the
wavelength (third dimension), so the whole cube is
@mymath{300\times300\times3600} voxels (volume elements).
+To create a narrow-band image that only contains 100 slices around a certain
wavelength, you can crop that section (using @ref{Crop}), giving you a
@mymath{300\times300\times100} cube.
+You can now use the @code{collapse-sum} operator below to ``collapse'' all the
100 slices into one 2D image that has @mymath{300\times300} pixels.
+Every pixel in this 2D image will have the flux of the sum of the 100 slices.
-$ ls -lh random.fits random-u8.fits
--rw-r--r-- 1 name name 85K Jan 01 13:40 random.fits
--rw-r--r-- 1 name name 17K Jan 01 13:45 random-u8.fits
-@end example
+@table @command
-As you see, when using a suitable data type, we can shrink the size of the file significantly without losing any information (from 85 kilobytes to 17 kilobytes).
-This difference can be felt much better for larger (real-world) datasets, so
be sure to always set the output data type after calling this operator.
+@item to-1d
+Convert the input operand into a 1D array, irrespective of the number of dimensions it has.
+This operator only takes a single operand (the input array) and just updates
the metadata.
+Therefore it does not change the layout of the array contents in memory and is
very fast.
+If no further operation is requested on the 1D array, recall that Arithmetic
will write a 1D array as a table column by default.
+In case you want the output to be saved as a 1D image, or to see it on the standard output, please use the @option{--onedasimage} or @option{--onedonstdout} options respectively (see @ref{Invoking astarithmetic}).
-@item random-from-hist
-Similar to @code{random-from-hist-raw}, but do not return the exact bin value,
instead return a random value from a uniform distribution within each bin.
-Therefore the following limitations have to be taken into account (compared to
@code{random-from-hist-raw}):
-@itemize
-@item
-The number associated with each bin (in the bin column) should be its center.
-@item
-The bins have to be in descending order (so the second row in the bin column
is larger than the first).
-@item
-The bin widths (distance from one bin to another) have to be fixed.
-@end itemize
+This operator is useful in scenarios where after some operations on a 2D image
or 3D cube, the dimensionality is no longer relevant for you and you just care
about the values.
+In the example below, we will first make a simple 2D image from a plain-text
file, then convert it to a 1D array:
-For a demonstration, let's replace @code{random-from-hist-raw} with
@code{random-from-hist} in the example of the description of
@code{random-from-hist-raw}.
-Note how we are manually converting the output of this operator into
single-precision floating point (32-bit, since the default 64-bit precision is
statistically meaningless in this scenario and we do not want to waste storage,
memory and running time):
@example
-$ echo "" | awk '@{for(i=1;i<5;++i) print i, i*i@}' \
- > histogram.txt
-
-$ astarithmetic 100 100 2 makenew \
- load-col-1-from-histogram.txt \
- load-col-2-from-histogram.txt \
- random-from-hist float32 \
- --output=random.fits
-
-$ aststatistics random.fits --asciihist --numasciibins=50
- | *
- | *** ********
- | ************
- | *************
- | * * *************
- | * ***********************
- | *************************
- | *************************
- | *************************************
- |********* * **************************************
- |**************************************************
- |--------------------------------------------------
-@end example
-
-You can see that the pixels of @file{random.fits} are no longer just 1, 2, 3 or 4.
-Instead, the values within each bin are selected from a uniform distribution
covering that bin.
-This creates the step-like feature in the histogram of the output.
-
-Of course, this extra uniform random number generation can make your program
slower so be sure to check if it is worth it.
-In particular, one way to avoid this (and use @command{random-from-hist-raw}
with a more contiguous-looking output distribution) is to simply use a
higher-resolution histogram (assuming it is possible: you have a sufficient
number of data points, or you have an analytical expression that you can sample
at smaller bin sizes).
+## Contents of 'a.txt' to start with.
+$ cat a.txt
+# Image 1: DEMO [counts, uint8] An example image
+1 2 3
+4 5 6
+7 8 9
-To better demonstrate this operator and its practical usage in everyday
research, let's look at another example:
-Assume you want to get 100 random star magnitudes that follow the real-world
Gaia Data release 3 magnitude distribution within a radius of 2 degrees around
the (RA,Dec) coordinate of (1.23,4.56).
-Let's further assume that you want to distribute them uniformly over an image
of size 1000 by 1000 pixels.
-So your desired output table should have three columns, the first two are
pixel positions of each star, and the third is the magnitude.
+## Convert the text image into a FITS image.
+$ astconvertt a.txt -o a.fits
-First, we need to query the Gaia database and ask for all the magnitudes in
this region of the sky.
-We know that Gaia is not complete for stars fainter than the 20th magnitude,
so we will use the @option{--range} option and only ask for those stars that
are brighter than magnitude 20.
+## Convert it into a table column (1D):
+$ astarithmetic a.fits to-1d -o table.fits
-@example
-$ astquery gaia --dataset=dr3 --center=1.23,3.45 --radius=2 \
- --column=phot_g_mean_mag --output=gaia.fits \
- --range=phot_g_mean_mag,-inf,20
+## Convert it into a 1D image:
+$ astarithmetic a.fits to-1d -o oned.fits --onedasimage
@end example
-We now have more than 25000 magnitudes in @file{gaia.fits}!
-To get a more accurate random sampling of our stars, let's construct a
histogram with 500 bins, and generate our three desired randomly selected
columns:
+@cindex Flattening (CNNs)
+A more real-world example would be the following: assume you want to
``flatten'' two images into a single 1D array (as commonly done in
convolutional neural networks, or
CNNs@footnote{@url{https://en.wikipedia.org/wiki/Convolutional_neural_network}}).
+First, we show the contents of a new @mymath{2\times2} image in a plain-text file, then convert it to a 2D FITS image (@file{b.fits}).
+We will then use arithmetic to make both @file{a.fits} (from the example
above) and @file{b.fits} into a 1D array and stitch them together into a single
1D image with one call to Arithmetic.
+For a description of the @code{stitch} operator, see below (same section).
@example
-$ aststatistics gaia.fits --histogram --numbins=500 \
- --output=gaia-hist.fits
+## Contents of 'b.txt':
+$ cat b.txt
+# Image 1: DEMO [counts, uint8] An example image
+10 11
+12 13
-$ asttable gaia-hist.fits -i
+## Convert the text image into a FITS image.
+$ astconvertt b.txt -o b.fits
-$ echo 1000 \
- | awk '@{for(i=0;i<100;++i) print $1/2@}' \
- | asttable -c'arith $1 500 mknoise-uniform' \
- -c'arith $1 500 mknoise-uniform' \
- -c'arith $1 \
- load-col-1-from-gaia-hist.fits-hdu-1 \
- load-col-2-from-gaia-hist.fits-hdu-1 \
- random-from-hist float32'
+## Flatten the two images into a single 1D image:
+$ astarithmetic a.fits to-1d b.fits to-1d 2 1 stitch -g1 \
+ --onedonstdout --quiet
+1
+2
+3
+4
+5
+6
+7
+8
+9
+10
+11
+12
+13
@end example
-These columns can easily be placed in the format for @ref{MakeProfiles} to be
inserted into an image automatically.
-@end table
+@item stitch
+Stitch (connect) any number of given images together along the given dimension.
+The output has the same number of dimensions as the input, but the number of
pixels along the requested dimension will be different from the inputs.
+The @code{stitch} operator takes at least three operands:
+@itemize
+@item
+The first popped operand (placed just before @code{stitch}) is the direction
(dimension) that the images should be stitched along.
+The first FITS dimension is along the horizontal; therefore, a value of @code{1} will stitch them horizontally.
+Similarly, giving a value of @code{2} will result in a vertical stitch.
-@node Box shape operators, Loading external columns, Random number generators,
Arithmetic operators
-@subsubsection Box shape operators
+@item
+The second popped operand is the number of images that should be stitched.
-The operators here help you in defining or using coordinates that form a
``box'' (a rectangular region).
+@item
+Depending on the value given to the second popped operand, @code{stitch} will
pop the given number of datasets from the stack and stitch them along the given
dimension.
+The popped images have to have the same number of pixels along the other
dimension.
+The order of the stitching is defined by how they are placed in the
command-line, not how they are popped (after being popped, they are placed in a
list in the same order).
+@end itemize
-@table @command
-@item box-around-ellipse
-Return the width (along horizontal) and height (along vertical) of a box that
encompasses an ellipse with the same center point.
-The top-popped operand is assumed to be the position angle (angle from the
horizontal axis) in @emph{degrees}.
-The second and third popped operands are the minor and major radii of the
ellipse respectively.
-This operator outputs two operands on the general stack.
-The first one is the width and the second (which will be the top one when this
operator finishes) is the height.
+For example, in the commands below, we will first crop out fixed-size regions of @mymath{100\times300} pixels from a larger image (@file{large.fits}).
+In the first call of Arithmetic below, we will stitch the bottom set of crops together along the first (horizontal) axis.
+In the second Arithmetic call, we will stitch all 6 along both dimensions.
-If the value to the second popped operand (minor axis) is larger than the
third (major axis), a NaN value will be written for both the width and height
of that element and a warning will be printed (the warning can be disabled with
the @option{--quiet} option).
+@example
+## Crop the fixed-size regions of a larger image ('-O' is the
+## short form of the '--mode' option).
+$ astcrop large.fits -Oimg --section=1:100,1:300 -oa.fits
+$ astcrop large.fits -Oimg --section=101:200,1:300 -ob.fits
+$ astcrop large.fits -Oimg --section=201:300,1:300 -oc.fits
+$ astcrop large.fits -Oimg --section=1:100,301:600 -od.fits
+$ astcrop large.fits -Oimg --section=101:200,301:600 -oe.fits
+$ astcrop large.fits -Oimg --section=201:300,301:600 -of.fits
-As an example, if your ellipse has a major axis radius of 10 units, a minor
axis radius of 4 units and a position angle of 20 degrees, you can estimate the
bounding box with this command:
+## Stitch the bottom three crops into one image.
+$ astarithmetic a.fits b.fits c.fits 3 1 stitch -obottom.fits
-@example
-$ echo "10 4 20" \
- | asttable -c'arith $1 $2 $3 box-around-ellipse'
+## Stitch all the 6 crops along both dimensions.
+$ astarithmetic a.fits b.fits c.fits 3 1 stitch \
+ d.fits e.fits f.fits 3 1 stitch \
+ 2 2 stitch -g1 -oall.fits
@end example
-Alternatively, if your three values are in separate FITS arrays/images, you can use the command below to have the width and height in similarly sized FITS arrays.
-In this example @file{a.fits} and @file{b.fits} are respectively the major and
minor axis lengths and @file{pa.fits} is the position angle (in degrees).
-Also, in all three, we assume the first extension is used.
-After it is done, the height of the box will be put in @file{h.fits} and the
width will be in @file{w.fits}.
-Just note that because this operator has two output datasets, you need to
first write the height (top output operand) into a file and free it with the
@code{tofilefree-} operator, then write the width in the file given to
@option{--output}.
+The start of the last command is like the one before it (stitching the bottom
three crops along the first FITS dimension, producing a @mymath{300\times300}
image).
+Later in the same command, we then stitch the top three crops horizontally (again, into a @mymath{300\times300} image).
+This leaves the two @mymath{300\times300} images on the stack (see @ref{Reverse polish notation}).
+We finally stitch those two along the second (vertical) dimension.
+This operator is therefore useful in scenarios like placing the CCD amplifiers
into one image.
+
+@item trim
+Trim all blank elements from the outer edges of the input operand (it only
takes a single operand).
+For example see the commands below using Table's @ref{Column arithmetic}:
@example
-$ astarithmetic a.fits b.fits pa.fits box-around-ellipse \
- tofilefree-h.fits -ow.fits -g1
+$ cat table.txt
+nan
+nan
+nan
+3
+4
+nan
+5
+6
+nan
+
+$ asttable table.txt -Y -c'arith $1 trim'
+3.000000
+4.000000
+nan
+5.000000
+6.000000
@end example
-Finally, if you need to treat the width and height separately for further
processing, you can call the @code{set-} operator two times afterwards like
below.
-Recall that the @code{set-} operator will pop the top operand, and put it in
memory with a certain name, bringing the next operand to the top of the stack.
+Similarly, on 2D images or 3D cubes, all outer rows/columns or slices that are
fully blank get ``trim''ed with this operator.
+This is therefore a very useful operator for extracting a certain feature
within your dataset.
-For example, let's assume @file{catalog.fits} has at least three columns
@code{MAJOR}, @code{MINOR} and @code{PA} which specify the major axis, minor
axis and position angle respectively.
-But you want the final width and height in 32-bit floating point numbers (not
the default 64-bit, which may be too much precision in many scenarios).
-You can do this with the command below (note you can also break lines with
@key{\}, within the single-quote environment)
+For example, let's assume that you have run @ref{NoiseChisel} and @ref{Segment} on an image to extract all clumps and objects.
+With the command below on Segment's output, you will have a smaller image that only contains the sky-subtracted input pixels corresponding to object 263.
@example
-$ asttable catalog.fits \
- -c'arith MAJOR MINOR PA box-around-ellipse \
- set-height set-width \
- width float32 height float32'
+$ astarithmetic seg.fits -hINPUT seg.fits -hOBJECTS \
+ 263 ne nan where trim --output=obj-263.fits
@end example
-@item box-vertices-on-sphere
-@cindex Polygon
-@cindex Vertices on sphere (sky)
-Convert a box center and width to the coordinates of the vertices of the box
on a left-hand spherical coordinate system.
-In a left-handed spherical coordinate system, the longitude increases towards
the left while north is up (as in the RA and Dec direction of the equatorial
coordinate system used in astronomy).
-This operator therefore takes four input operands (the RA and Dec of the box's
center, as well as the width of the box in each direction).
+@item collapse-sum
+Collapse the given dataset (second popped operand) by summing all elements
along the first popped operand (a dimension in the FITS standard: counting
from one, starting with the fastest dimension).
+The returned dataset has one dimension less compared to the input.
-After it is complete, this operator places 8 operands on the stack which
contain the RA and Dec of the four vertices of the box in the following
anti-clockwise order:
-@enumerate
-@item
-Bottom-left vertex Longitude (RA)
-@item
-Bottom-left vertex Latitude (Dec)
-@item
-Bottom-right vertex Longitude (RA)
-@item
-Bottom-right vertex Latitude (Dec)
-@item
-Top-right vertex Longitude (RA)
-@item
-Top-right vertex Latitude (Dec)
-@item
-Top-left vertex Longitude (RA)
-@item
-Top-left vertex Latitude (Dec)
-@end enumerate
+The output will have a double-precision floating point type irrespective of
the input dataset's type.
+Doing the operation in double-precision (64-bit) floating point makes the
collapse (summation) less affected by floating point errors.
+But afterwards, single-precision floating points are usually enough in real
(noisy) datasets.
+So depending on the type of the input and its nature, it is recommended to use
one of the type conversion operators on the returned dataset.
-For example, with the command below, we will retrieve the vertex coordinates
of a rectangle around a point with RA=20 and Dec=0 (on the equator).
-The rectangle will have a 1 degree edge along the RA direction and a 2 degree
edge along the declination.
-In this example, we are using the @option{-Afixed -B2} only for demonstration
purposes here due to the round numbers!
-In general, it is best to write your outputs to a binary FITS table to
preserve the full precision (see @ref{Printing floating point numbers}).
+@cindex World Coordinate System (WCS)
+If any WCS is present, the returned dataset will also lack the respective
dimension in its WCS matrix.
+Therefore, when the WCS is important for later processing, be sure that the
input is aligned with the respective axes: all non-diagonal elements in the WCS
matrix are zero.
+
+@cindex Data cubes
+@cindex 3D data-cubes
+@cindex Cubes (3D data)
+@cindex Narrow-band image
+@cindex IFU: Integral Field Unit
+@cindex Integral field unit (IFU)
+One common application of this operator is the creation of pseudo broad-band
or narrow-band 2D images from 3D data cubes.
+For example, integral field unit (IFU) data products that have two spatial
dimensions (first two FITS dimensions) and one spectral dimension (third FITS
dimension).
+The command below will collapse the whole third dimension into a 2D array the
size of the first two dimensions, and then convert the output to
single-precision floating point (as discussed above).
@example
-$ echo "20 0 1 2" \
- | asttable -Afixed -B2 \
- -c'arith $1 $2 $3 $4 box-vertices-on-sphere'
-20.50 -1.00 19.50 -1.00 19.50 1.00 20.50 1.00
+$ astarithmetic cube.fits 3 collapse-sum float32
@end example
-We see that the bottom-left vertex is at (RA,Dec) of @mymath{(20.50,-1.0)}
and the top-right vertex is at @mymath{(19.50,1.00)}.
-These could have easily been done by manually adding and subtracting!
-But you will see that the complexity arises at higher/lower declinations.
-For example, with the command below, let's see the vertex coordinates of the
same box after moving its center to (RA,Dec) of (20,85):
+@item collapse-mean
+Similar to @option{collapse-sum}, but the returned dataset will be the mean
value along the collapsed dimension, not the sum.
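+
+For example, a sketch building on the @file{cube.fits} example of
+@code{collapse-sum} above, averaging (instead of summing) along the third
+dimension:
+@example
+$ astarithmetic cube.fits 3 collapse-mean float32
+@end example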
+
+@item collapse-number
+Similar to @option{collapse-sum}, but the returned dataset will be the number
of non-blank values along the collapsed dimension.
+The output will have a 32-bit signed integer type.
+If the input dataset does not have blank values, all the elements in the
returned dataset will have a single value (the length of the collapsed
dimension).
+Therefore this is mostly relevant when there are blank values in the dataset.
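+
+For example, the sketch below (on the hypothetical @file{cube.fits} of the
+@code{collapse-sum} example above) would count the non-blank slices behind
+each spatial pixel:
+@example
+$ astarithmetic cube.fits 3 collapse-number -onum.fits
+@end example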
+
+@item collapse-min
+Similar to @option{collapse-sum}, but the returned dataset will have the same
numeric type as the input and will contain the minimum value for each pixel
along the collapsed dimension.
+
+@item collapse-max
+Similar to @option{collapse-sum}, but the returned dataset will have the same
numeric type as the input and will contain the maximum value for each pixel
along the collapsed dimension.
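+
+For example, a sketch returning the per-pixel maximum along the third
+dimension of the hypothetical @file{cube.fits} used above:
+@example
+$ astarithmetic cube.fits 3 collapse-max -omax.fits
+@end example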
+
+@item collapse-median
+Similar to @option{collapse-sum}, but the returned dataset will have the same
numeric type as the input and will contain the median value for each pixel
along the collapsed dimension.
+
+The median involves sorting, therefore @code{collapse-median} will run the
calculations on separate CPU threads to speed up the operation.
+By default, Arithmetic will detect and use all available threads, but you can
override this with the @option{--numthreads} (or @option{-N}) option.
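+
+For example, a sketch on the hypothetical @file{cube.fits} used above,
+limiting the operation to 4 threads:
+@example
+$ astarithmetic cube.fits 3 collapse-median float32 \
+                --numthreads=4
+@end example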
+
+@item collapse-sigclip-mean
+Collapse the input dataset (fourth popped operand) along the FITS dimension
given as the first popped operand by calculating the sigma-clipped mean.
+The sigma-clipping parameters (namely, the multiple of sigma and termination
criteria) are read as the third and second popped operands respectively.
+For more on sigma-clipping, see @ref{Sigma clipping}.
+
+For example, with the command below, the pixels of the two-dimensional input
@file{image.fits} will be collapsed into a one-dimensional output.
+The first popped operand is @code{2}, so all the pixels that are vertically
on top of each other will be collapsed, such that the output has the same
number of pixels as the horizontal axis of the input.
+During the collapsing, all pixels that are more than @mymath{3\sigma} away
(third popped operand) are rejected, and the clipping continues until the
standard deviation changes by less than @mymath{0.2} between clips.
@example
-$ echo "20 85 1 2" \
- | asttable -Afixed -B2 \
- -c'arith $1 $2 $3 $4 box-vertices-on-sphere'
-24.78 84.00 15.22 84.00 12.83 86.00 27.17 86.00
+$ astarithmetic image.fits 3 0.2 2 collapse-sigclip-mean \
+ --output=collapsed-vertical.fits
@end example
-Even though we didn't change the central RA (20) or the size of the box
along the RA (1 degree), the RA of the bottom-left vertex is now at 24.78;
almost 5 degrees away!
-This occurs because of the spherical coordinate system: we measure the
longitude (e.g., RA) in the following way:
-@enumerate
-@item
-@cindex Meridian
-@cindex Great circle
-@cindex Circle (great)
-Draw a meridian that passes your point.
-The meridian is half of a @url{https://en.wikipedia.org/wiki/Great_circle,
great-circle} (which has a diameter equal to the sphere's diameter) and
passes through both poles.
-@item
-Find the intersection of that meridian with the equator.
-@item
-The distance of the intersection and the reference point (along the equator)
defines the longitude angle.
-@end enumerate
+@cartouche
+@noindent
+@strong{Printing output of collapse in plain-text:} the default datatype of
@code{collapse-sigclip-mean} is 32-bit floating point.
+This is sufficient for any observed astronomical data.
+However, if you request a plain-text output, or decide to print/view the
output as plain-text on the standard output, the full set of decimals may not
be printed in some situations.
+This can lead to apparently discrete values in the output of this operator
when viewed in plain-text!
+The FITS format is always superior (since it stores the value in binary,
therefore not having the problem above).
+But if you are forced to save the output in plain-text, use the @code{float64}
operator after this to change the type to 64-bit floating point (which will
print more decimals).
+@end cartouche
-@cindex Small circle
-@cindex Circle (small)
-As you get more distant from the equator (declination becomes non-zero), any
change along the RA (towards the east; 1 degree in the example above) will no
longer be on a great circle, but along a
``@url{https://en.wikipedia.org/wiki/Circle_of_a_sphere, small circle}''.
-On a small circle that is defined by the fixed declination @mymath{\delta},
the distance between two points is smaller than the distance between their
projections on the equator (as described in the definition of longitude
above).
-It is smaller by a factor of @mymath{\cos({\delta})}.
+@item collapse-sigclip-std
+Collapse the input dataset along the given FITS dimension by calculating the
sigma-clipped standard deviation.
+Except for returning the standard deviation after clipping, this function is
similar to @code{collapse-sigclip-mean}, see the description of that operator
for more.
-Therefore, an angular change (let's call it @mymath{\Delta_{lon}}) along the
small circle defined by the fixed declination of @mymath{\delta} corresponds to
@mymath{\Delta_{lon}/\cos(\delta)} on the equator.
-@end table
+@item collapse-sigclip-median
+Collapse the input dataset along the given FITS dimension by calculating the
sigma-clipped median.
+Except for returning the median after clipping, this function is similar to
@code{collapse-sigclip-mean}, see the description of that operator for more.
-@node Loading external columns, Size and position operators, Box shape
operators, Arithmetic operators
-@subsubsection Loading external columns
+@item collapse-sigclip-number
+Collapse the input dataset along the given FITS dimension by calculating the
number of elements that remain after sigma-clipping.
+Except for returning the number after clipping, this function is similar to
@code{collapse-sigclip-mean}, see the description of that operator for more.
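+
+For example, a sketch collapsing the third dimension of the hypothetical
+@file{cube.fits} used above with a @mymath{3\sigma} multiple and a 0.2
+termination criteria, keeping the number of remaining elements per pixel:
+@example
+$ astarithmetic cube.fits 3 0.2 3 collapse-sigclip-number \
+                --output=sigclip-num.fits
+@end example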
-In the Arithmetic program, you can always load new datasets by simply giving
their names.
-However, they can only be images, not columns.
-In the Table program, you can load columns in @ref{Column arithmetic}, but it
has to be columns within the same table (and thus the same number of rows).
-However, in some situations, it is necessary to use certain columns of a table
in the Arithmetic program, or columns of different rows (from the main input)
in Table.
+@item add-dimension-slow
+Build a higher-dimensional dataset from all the input datasets stacked after
one another (along the slowest dimension).
+The first popped operand has to be a single number.
+It is used by the operator to know how many operands it should pop from the
stack (and the size of the output in the new dimension).
+The rest of the operands must have the same size and numerical data type.
+This operator currently only works for 2D input operands; please contact us
if you want inputs to have different dimensions.
-@table @command
-@item load-col-%-from-%
-@itemx load-col-%-from-%-hdu-%
-Load the requested column (first @command{%}) from the requested file (second
@command{%}).
-If the file is a FITS file, it is also necessary to specify a HDU using the
second form (where the HDU identifier is the third @command{%}).
-For example, @command{load-col-MAG-from-catalog.fits-hdu-1} will load the
@code{MAG} column from HDU 1 of @code{catalog.fits}.
+The output's WCS (which should have a different dimensionality compared to the
inputs) can be read from another file with the @option{--wcsfile} option.
+If no file is specified for the WCS, the first dataset's WCS will be used;
you can later add/change the necessary WCS keywords with the FITS keyword
modification features of the Fits program (see @ref{Fits}).
-For example, let's assume you have the following two tables, and you would
like to add the first column of the first with the second:
+If your datasets do not have the same type, you can use the type
transformation operators of Arithmetic that are discussed below.
+Just beware of overflow if you are transforming to a smaller type, see
@ref{Numeric data types}.
+
+For example, let's assume you have 3 two-dimensional images @file{a.fits},
@file{b.fits} and @file{c.fits} (each with @mymath{200\times100} pixels).
+You can construct a 3D data cube with @mymath{200\times100\times3} voxels
(volume-pixels) using the command below:
@example
-$ asttable tab-1.fits
-1 43.23
-2 21.91
-3 71.28
-4 18.10
+$ astarithmetic a.fits b.fits c.fits 3 add-dimension-slow
+@end example
-$ cat tab-2.txt
-5
-6
-7
-8
+@item add-dimension-fast
+Similar to @code{add-dimension-slow} but along the fastest dimension.
+This operator currently only works for 1D input operands; please contact us
if you want inputs to have different dimensions.
-$ asttable tab-1.fits -c'arith $1 load-col-1-from-tab-2.txt +'
-6
-8
-10
-12
-@end example
+For example, let's assume you have 3 one-dimensional datasets, each with 100
elements.
+With this operator, you can construct a @mymath{3\times100} pixel FITS image
that has 3 pixels along the horizontal and 100 pixels along the vertical.
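+
+As a minimal sketch (assuming three hypothetical 1D spectra of 100 elements
+each in @file{s1.fits}, @file{s2.fits} and @file{s3.fits}):
+@example
+$ astarithmetic s1.fits s2.fits s3.fits 3 add-dimension-fast
+@end example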
@end table
-@node Size and position operators, Building new dataset and stack management,
Loading external columns, Arithmetic operators
-@subsubsection Size and position operators
+@node Conditional operators, Mathematical morphology operators, Dimensionality
changing operators, Arithmetic operators
+@subsubsection Conditional operators
-With the operators below you can get metadata about the top dataset on the
stack.
+Conditional operators take two inputs and return a binary output that can
only have two values: 0 (for pixels where the condition was false) or 1 (for
pixels where the condition was true).
+Because of the binary (2-valued) nature of their outputs, the output is
stored in an @code{unsigned char} data type (see @ref{Numeric data types})
to speed up processing and take less space in storage.
+There are two exceptions to the general features above: @code{isblank} only
takes one input, and @code{where} takes three and does not return a binary
output; see their descriptions below for more.
-@table @code
-@item index
-Add a new operand to the stack with an integer type and the same size (in all
dimensions) as top operand on the stack (before it was called; it is not
popped!).
-The first pixel in the returned operand is zero, and every later pixel's value
is incremented by one.
-It is important to remember that the top operand is not popped by this
operator, so it remains on the stack.
-After this operator is finished, it adds a new operand to the stack.
-To pop the previous operand, you can use the @code{indexonly} operator.
+@table @command
+@item lt
+Less than: creates a binary output (values either 0 or 1) where each pixel
will be 1 if the second popped operand is smaller than the first popped operand
and 0 otherwise.
+If both operands are images, then all the pixels will be compared with their
counterparts in the other image.
-The data type of the output is always an unsigned integer, and its width is
determined from the number of pixels/rows in the top operand.
-For example if there are only 108 rows in a table, the returned column will
have an unsigned 8-bit integer type (that can keep 256 separate values).
-But if the top operand is a @mymath{1000\times1000=10^6} pixel image, the
output will be a 32-bit unsigned integer.
-For the various types of integers, see @ref{Numeric data types}.
+For example, the pixels in the output of the command below will have a value
of 1 (true) if their value in @file{image1.fits} is less than their value in
@file{image2.fits}.
+Otherwise, their value will be 0 (false).
+@example
+$ astarithmetic image1.fits image2.fits lt
+@end example
+If only one operand is an image, then all the pixels will be compared with the
single value (number) of the other operand.
+For example:
+@example
+$ astarithmetic image1.fits 1000 lt
+@end example
+Finally if both are numbers, then the output is also just one number (0 or 1).
+@example
+$ astarithmetic 4 5 lt
+@end example
-To see the index image along with the actual image, you can use the
@option{--writeall} option to have a multi-HDU output (without
@option{--writeall}, Arithmetic will complain if more than one operand is left
at the end).
-After DS9 opens with the second command, flip between the two extensions.
+@item le
+Less or equal: similar to @code{lt} (`less than' operator), but returning 1
when the second popped operand is smaller or equal to the first.
+For example
@example
-$ astarithmetic image.fits index --writeall
-$ astscript-fits-view image_arith.fits
+$ astarithmetic image1.fits 1000 le
@end example
-Below is a review of some usage examples of this operator:
-
-@table @asis
-@item Image: masking margins
-With the command below, we will be masking all pixels that are 20 pixels away
from the edges of the image (on the margin).
-Here is a description of the command below (for the basics of Arithmetic's
notation, see @ref{Reverse polish notation}):
-@itemize
-@item
-The @code{index} operator just adds a new dataset on the stack: unlike almost
all other operators in Arithmetic, @code{index} doesn't remove its input
dataset from the stack (use @code{indexonly} for the ``normal'' behavior).
-This is because @code{index} returns the pixel metadata not data.
-As a result, after @code{index}, we have two operands on the stack: the input
image and the index image.
-@item
-With the @code{set-i} operator, the top operand (the image containing the
index of each pixel) is popped from the stack and associated to the name
@code{i}.
-Therefore after this, the stack only has the input image.
-For more on the @code{set-} operator, see @ref{Operand storage in memory or a
file}.
-@item
-We need three values from the commands before Arithmetic (for the width and
height of the image and the size of the margin).
-To make the rest of the command easier to read/use, we'll define them in
Arithmetic as three named operators (respectively called @code{w}, @code{h} and
@code{m}).
-All three are integers that will have a positive value lower than
@mymath{2^{16}=65536} (for a ``normal'' image!).
-Therefore, we will store them as 16-bit unsigned integers with the
@code{uint16} operator (this will help optimal processing in later steps).
-For more the type changing operators, see @ref{Numerical type conversion
operators}.
-@item
-Using the modulo @code{%} and division (@code{/}) operators on the index image
and the width, we extract the horizontal (X) and vertical (Y) positions of each
pixel in separately named operands called @code{X} and @code{Y}.
-The maximum value in these two will also fit within an unsigned 16-bit
integer, so we'll also store these in that type.
-@item
-For the horizontal (X) dimension, we select pixels that are less than the
margin (@code{X m lt}) and those that are more than the width subtracted by the
margin (@code{X w m - gt}).
-@item
-The output of the @code{lt} and @code{gt} conditional operators above is a
binary (0 or 1 valued) image.
-We therefore merge them into one binary image using the @code{or} operator.
-For more, see @ref{Conditional operators}.
-@item
-We repeat the two steps above for the vertical (Y) dimension.
-@item
-Once the images containing the to-be-masked pixels in each dimension are made,
we combine them into one binary image with a final @code{or} operator.
-At this point, the stack only has two operands: 1) the input image and 2) the
binary image that has a value of 1 for all pixels whose value should be changed.
-@item
-A single-element operand (@code{nan}) is added on the stack.
-@item
-Using the @code{where} operator, we replace all the pixels that are non-zero
in the second operand (on the margins) to the top operand's value (NaN) in the
third popped operand (image that was read from @code{image.fits}).
-For more on the @code{where} operator, see @ref{Conditional operators}.
-@end itemize
+@item gt
+Greater than: similar to @code{lt} (`less than' operator), but returning 1
when the second popped operand is greater than the first.
+For example
+@example
+$ astarithmetic image1.fits 1000 gt
+@end example
+@item ge
+Greater or equal: similar to @code{lt} (`less than' operator), but returning 1
when the second popped operand is larger or equal to the first.
+For example
@example
-$ margin=20
-$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
-$ height=$(astfits image.fits --keyvalue=NAXIS2 -q)
-$ astarithmetic image.fits index set-i \
- $width uint16 set-w \
- $height uint16 set-h \
- $margin uint16 set-m \
- i w % uint16 set-X \
- i w / uint16 set-Y \
- X m lt X w m - gt or \
- Y m lt Y h m - gt or \
- or nan where
+$ astarithmetic image1.fits 1000 ge
@end example
-@item Image: Masking regions outside a circle
-As another example for usage on an image, in the command below we are using
@code{index} to define an image where each pixel contains the distance to the
pixel with X,Y coordinates of 345,250.
-We are then using that distance image to only keep the pixels that are within
a 50 pixel radius of that point.
+@item eq
+Equality: similar to @code{lt} (`less than' operator), but returning 1 when
the two popped operands are equal (to double precision floating point accuracy).
+@example
+$ astarithmetic image1.fits 1000 eq
+@end example
-The basic concept behind this process is very similar to the previous example,
with a different mathematical definition for pixels to mask.
-The major difference is that since we want the distance to a pixel within
the image, we need to have negative values, and the center coordinates can
be at sub-pixel positions.
-The best numeric datatype for intermediate steps is therefore floating point.
-64-bit floating point can have a precision of up to 15 digits after the
decimal point.
-This is far too much for what we need here: in astronomical imaging, the PSF
is usually on the scale of 1 or more pixels (see @ref{Sampling theorem}).
-So even reaching a precision of one millionth of a pixel (offered by 32-bit
floating points) is beyond our wildest dreams (see @ref{Numeric data types}).
-We will also define the horizontal (X) and vertical (Y) operands after
shifting to the desired central point.
+@item ne
+Non-Equality: similar to @code{lt} (`less than' operator), but returning 1
when the two popped operands are @emph{not} equal (to double precision floating
point accuracy).
+@example
+$ astarithmetic image1.fits 1000 ne
+@end example
+@item and
+Logical AND: returns 1 if both operands have a non-zero value and 0
otherwise (when either operand is zero).
+Both operands have to be the same kind: either both images or both numbers;
the output is most meaningful when the inputs are binary (with pixel values
of 0 or 1).
@example
-$ radius=50
-$ centerx=345.2
-$ centery=250.3
-$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
-$ astarithmetic image.fits index set-i \
- $width uint16 set-w \
- $radius float32 set-r \
- $centerx float32 set-cx \
- $centery float32 set-cy \
- i w % cx - set-X \
- i w / cy - set-Y \
- X X x Y Y x + sqrt r gt \
- nan where --output=arith-masked.fits
+$ astarithmetic image1.fits image2.fits -g1 and
@end example
-@cartouche
-@noindent
-@strong{Optimal data types have significant benefits:} choosing the minimum
required datatype for your operation is very important to avoid wasting your
CPU and RAM.
-Don't simply default to 64-bit floating points for everything!
-Integer operations are much faster than floating points, and within floating
point types, 32-bit is faster and will use half the RAM/storage!
-For more, see @ref{Numeric data types}.
-@end cartouche
+For example, if you only want to see which pixels in an image have a value
@emph{between} 50 (greater equal, or inclusive) and 200 (less than, or
exclusive), you can use this command:
+@example
+$ astarithmetic image.fits set-i i 50 ge i 200 lt and
+@end example
-The example above was just a demo for usage of the @code{index} operator and
some important concepts.
-But it is not the easiest way to achieve the desired result above!
-An easier way for the scenario above (to keep a circle within an image and set
everything else to NaN) is to use MakeProfiles in combination with Arithmetic,
like below:
+@item or
+Logical OR: returns 1 if either one of the operands is non-zero and 0 only
when both operands are zero.
+Both operands have to be the same kind: either both images or both numbers.
+The usage is similar to @code{and}.
+For example, if you only want to see which pixels in an image have a value
@emph{outside} the range of -100 (inclusive) to 200 (exclusive), in other
words, pixels less than -100 or greater/equal to 200, you can use this
command:
@example
-$ radius=50
-$ centerx=345.2
-$ centery=250.3
-$ echo "1 $centerx $centery 5 $radius 0 0 1 1 1" \
- | astmkprof --background=image.fits \
- --mforflatpix --clearcanvas \
- -omkprof-mask.fits --type=uint8
-$ astarithmetic image.fits mkprof-mask.fits not \
- nan where -g1 -omkprof-masked.fits
+$ astarithmetic image.fits set-i i -100 lt i 200 ge or
@end example
-@item Tables: adding new columns with row index
-Within Table, you can use this operator to add an index column like below (see
the @code{counter} operator for starting the count from one).
-
+@item not
+Logical NOT: returns 1 when the operand is 0 and 0 when the operand is
non-zero.
+The operand can be an image or a number; for an image, it is applied to
each pixel separately.
+For example, if you want to know which pixels are not blank (and assuming that
we didn't have the @code{isnotblank} operator), you can use this @code{not}
operator on the output of the @command{isblank} operator described below:
@example
-## The index will be the second column.
-$ asttable table.fits -c'arith $1 index'
+$ astarithmetic image.fits isblank not
+@end example
-## The index will be the first column
-$ asttable table.fits -c'arith $1 index swap'
+@cindex Blank pixel
+@item isblank
+Test each pixel for being a blank value (see @ref{Blank pixels}).
+This is a conditional operator: the output has the same size and dimensions as
the input, but has an unsigned 8-bit integer type with two possible values:
either 1 (for a pixel that was blank) or 0 (for a pixel that was not blank).
+See the description of the @code{lt} operator above; the difference is that
this operator only needs one operand.
+For example:
+@example
+$ astarithmetic image.fits isblank
@end example
-@end table
+Because of the definition of a blank pixel, a blank value is not even equal to
itself, so you cannot use the equal operator above to select blank pixels.
+See the ``Blank pixels'' box below for more on blank pixels in Arithmetic.
-@item indexonly
-Similar to @code{index}, except that the top operand is popped from the stack
and is no longer available afterwards.
+In case you want to set non-blank pixels to an output pixel value of 1, it is
better to use @code{isnotblank} instead of `@code{isblank not}' (for more, see
the description of @code{isnotblank}).
-@item counter
-Similar to @code{index}, except that counting starts from one (not zero as in
@code{index}).
-Counting from one is usually necessary when adding row counters in tables,
like below:
+@item isnotblank
+The inverse of the @code{isblank} operator above (see that description for
more).
+Therefore, if a pixel has a blank value, the output of this operator will have
a 0 value for it.
+This operator is therefore similar to running `@code{isblank not}', but
slightly more efficient (it won't need the intermediate product of two
operators).
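+
+For example (mirroring the @code{isblank} example above):
+@example
+$ astarithmetic image.fits isnotblank
+@end example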
+
+@item where
+Change the input (pixel) value @emph{where}/if a certain condition holds.
+The conditional operators above can be used to define the condition.
+Three operands are required for @command{where}.
+The input format is demonstrated in this simplified example:
@example
-$ asttable table.fits -c'arith $1 counter swap'
+$ astarithmetic modify.fits binary.fits if-true.fits where
@end example
-@item counteronly
-Similar to @code{counter}, but the top operand before it is popped (no longer
available).
+The value of any pixel in @file{modify.fits} that corresponds to a non-zero
@emph{and} non-blank pixel of @file{binary.fits} will be changed to the value
of the same pixel in @file{if-true.fits} (this may also be a number).
+The 3rd and 2nd popped operands (@file{modify.fits} and @file{binary.fits}
respectively, see @ref{Reverse polish notation}) have to have the same
dimensions/size.
+@file{if-true.fits} can be either a number, or have the same dimension/size as
the other two.
-@item size
-Size of the dataset along a given FITS (or FORTRAN) dimension (counting from
1).
-The desired dimension should be the first popped operand and the dataset must
be the second popped operand.
-The output will be a single unsigned integer (dimensions cannot be negative).
-For example, the following command will produce the size of the first
extension/HDU (the default HDU) of @file{a.fits} along the second FITS axis.
+The 2nd popped operand (@file{binary.fits}) has to have @code{uint8} (or
@code{unsigned char} in standard C) type (see @ref{Numeric data types}).
+It is treated as a binary dataset (with only two values: zero and non-zero,
hence the name @code{binary.fits} in this example).
+However, commonly you will not be dealing with an actual FITS file of a
condition/binary image.
+You will probably define the condition in the same run based on some other
reference image and use the conditional and logical operators above to make a
true/false (or one/zero) image for you internally.
+For example, the case below:
@example
-$ astarithmetic a.fits 2 size
+$ astarithmetic in.fits reference.fits 100 gt new.fits where
@end example
-@cartouche
-@noindent
-@strong{Not optimal:} This operator reads the top element on the stack and
then simply reads its size along the given dimension.
-On a small dataset this won't consume much RAM, but if you want to put this in
a pipeline or use it on large image, the extra RAM and slow operation can
become meaningful.
-To avoid such issues, you can read the size along the given dimension using
the @option{--keyvalue} option of @ref{Keyword inspection and manipulation}.
-For example, in the code below, the X axis position of every pixel is returned:
+In the example above, any of the @file{in.fits} pixels that has a value in
@file{reference.fits} greater than @command{100}, will be replaced with the
corresponding pixel in @file{new.fits}.
+Effectively the @code{reference.fits 100 gt} part created the condition/binary
image which was added to the stack (in memory) and later used by @code{where}.
+The command above is thus equivalent to these two commands:
@example
-$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
-$ astarithmetic image.fits indexonly $width % -opix-x.fits
+$ astarithmetic reference.fits 100 gt --output=binary.fits
+$ astarithmetic in.fits binary.fits new.fits where
@end example
-@end cartouche
-@end table
+Finally, the input operands are read and used independently, so you can use
the same file more than once as any of the operands.
-@node Building new dataset and stack management, Operand storage in memory or
a file, Size and position operators, Arithmetic operators
-@subsubsection Building new dataset and stack management
+When the 1st popped operand to @code{where} (@file{if-true.fits}) is a single
number, it may be a NaN value (or any blank value, depending on its type) like
the example below (see @ref{Blank pixels}).
+When the number is blank, it will be converted to the blank value of the type
of the 3rd popped operand (@code{in.fits}).
+Hence, in the example below, all the pixels in @file{reference.fits} that have
a value greater than 100, will become blank in the natural data type of
@file{in.fits} (even though NaN values are only defined for floating point
types).
-With the operator here, you can create a new dataset from scratch to start
certain operations without any input data.
+@example
+$ astarithmetic in.fits reference.fits 100 gt nan where
+@end example
+@end table
+
+@node Mathematical morphology operators, Bitwise operators, Conditional
operators, Arithmetic operators
+@subsubsection Mathematical morphology operators
+
+@cindex Mathematical morphology
+From Wikipedia: ``Mathematical morphology (MM) is a theory and technique for
the analysis and processing of geometrical structures, based on set theory,
lattice theory, topology, and random functions. MM is most commonly applied to
digital images''.
+In theory it extends a very large body of research and methods in image
processing, but currently in Gnuastro it mainly applies to images that are
binary (only have a value of 0 or 1).
+For example, assume you have applied the greater-than operator (@code{gt}, see
@ref{Conditional operators}) to select all pixels in your image that are larger
than a value of 100.
+But they will all have a value of 1, and you want to separate the various
groups of pixels that are connected (for example, peaks of stars in your image).
+With the @code{connected-components} operator, you can give each connected
region of the output of @code{gt} a separate integer label.
@table @command
-@item makenew
-Create a new dataset that only has zero values.
-The number of dimensions is read as the first popped operand and the number of
elements along each dimension are the next popped operand (in reverse of the
popping order).
-The type of the new dataset is an unsigned 8-bit integer and all pixel values
have a value of zero.
-For example, if you want to create a new 100 by 200 pixel image, you can run
this command:
+@item erode
+@cindex Erosion
+Erode the foreground pixels (with value @code{1}) of the input dataset (second
popped operand).
+The first popped operand is the connectivity (see description in
@command{connected-components}).
+Erosion is simply a flipping of all foreground pixels (with value @code{1}) to
background (with value @code{0}) that are ``touching'' background pixels.
+``Touching'' is defined by the connectivity.
+In effect, this operator ``carves off'' the outer borders of the foreground,
making them thinner.
+This operator assumes a binary dataset (all pixels are @code{0} or @code{1}).
+For example, imagine that you have an astronomical image with a mean/sky value
of 0 units and a standard deviation (@mymath{\sigma}) of 100 units and many
galaxies in it.
+With the first command below, you can apply a threshold of @mymath{2\sigma} on
the image (by only keeping pixels that are greater than 200 using the
@command{gt} operator).
+The output of thresholding the image is a binary image (each pixel is either
smaller or equal to the threshold or larger than it).
+You can then erode the binary image with the second command below to remove
very small false positives (one or two pixel peaks).
@example
-$ astarithmetic 100 200 2 makenew
+$ astarithmetic image.fits 100 gt -obinary.fits
+$ astarithmetic binary.fits 2 erode -oout.fits
@end example
-@noindent
-To further extend the example, you can use any of the noise-making operators
to add noise to this new dataset (see @ref{Random number generators}), like the
command below:
+In fact, you can merge these operations into one command thanks to the reverse
polish notation (see @ref{Reverse polish notation}):
+@example
+$ astarithmetic image.fits 100 gt 2 erode -oout.fits
+@end example
+To see the effect of connectivity, try this:
@example
-$ astarithmetic 100 200 2 makenew 5 mknoise-sigma
+$ astarithmetic image.fits 100 gt 1 erode -oout-con-1.fits
@end example
-@item constant
-Return an operand that will have a constant value (first popped operand) in
all its elements.
-The number of elements is read from the second popped operand.
-The second popped operand is only used for its number of elements; its
numeric data type and its values are fully ignored and it is later freed.
+@item dilate
+@cindex Dilation
+Dilate the foreground pixels (with value @code{1}) of the binary input dataset
(second popped operand).
+The first popped operand is the connectivity (see description in
@command{connected-components}).
+Dilation is simply a flipping of all background pixels (with value @code{0})
to foreground (with value @code{1}) that are ``touching'' foreground pixels.
+``Touching'' is defined by the connectivity.
+In effect, this expands the outer borders of the foreground.
+This operator assumes a binary dataset (all pixels are @code{0} or @code{1}).
+The usage is similar to @code{erode}, for example:
+@example
+$ astarithmetic binary.fits 2 dilate -oout.fits
+@end example
-@cindex Provenance
-Here is one useful scenario for this operator in tables: you want to merge the
objects/rows of some catalogs together, but you first want to give each source
catalog a label/counter that distinguishes between the source of each row in
the merged/final catalog (using @ref{Invoking asttable}).
-The steps below show the usage of this.
+@item number-neighbors
+Return a dataset of the same size as the second popped operand, but where each
non-zero and non-blank input pixel is replaced with the number of its non-zero
and non-blank neighbors.
+The first popped operand is the connectivity (see above) and must be a
single value of an integer type.
+The dataset is assumed to be binary (having an unsigned 8-bit integer type).
-@example
-## Add label 1 to the RA, Dec, magnitude and magnitude error
-## rows of the first catalog.
-$ asttable cat-1.fits -cRA,DEC,MAG,MAG_ERR \
- -c'arith $1 1 constant' --output=tab-1.fits
+For example with the command below, you can select all pixels above a value of
100 in your image with the ``greater-than'' or @code{gt} operator (see
@ref{Conditional operators}).
+Recall that the output of all conditional operators is a binary output (having
a value of 0 or 1).
+In the same command, we will then find how many neighboring pixels of each
pixel (that was originally above the threshold) are also above the threshold.
-## Similar to above, but for the second catalog.
-$ asttable cat-2.fits -cRA,DEC,MAG,MAG_ERR \
- -c'arith $1 2 constant' --output=tab-2.fits
+@example
+$ astarithmetic image.fits 100 gt 2 number-neighbors
+@end example
-## Concatenate (merge/blend) the rows of the two tables into
-## one for the 5 columns, but also add a counter for each
-## object or row in the final catalog.
-$ asttable tab-1.fits --catrowfile=tab-2.fits \
- -c'arith $1 counteronly' \
- -cRA,DEC,MAG,MAG_ERR,5 --output=merged.fits \
- --colmetadata=1,ID_MERGED,counter,"Merged ID." \
- --colmetadata=6,SOURCE-CAT,counter,"Source ID."
+@item connected-components
+@cindex Connected components
+Find the connected components in the input dataset (second popped operand).
+The first popped operand is the connectivity used in the connected components
algorithm.
+The second popped operand is the dataset where connected components are to be
found.
+It is assumed to be a binary image (with values of 0 or 1).
+It must have an 8-bit unsigned integer type which is the format produced by
conditional operators.
+This operator will return a labeled dataset where the non-zero pixels in the
input will be labeled with a counter (starting from 1).
-## Add keyword information on each input. It is very important
-## to preserve this within the merged catalog. If the tables
-## came from public databases (for example on VizieR), give
-## their public identifier as the value.
-$ astfits merged.fits --write=/,"Source catalogs" \
- --write=CATSRC1,"I/355/gaiadr3","VizieR ID." \
- --write=CATSRC2,"Jane Doe","Name of source."
+The connectivity is a number between 1 and the number of dimensions in the
dataset (inclusive).
+1 corresponds to the weakest (symmetric) connectivity between elements and the
number of dimensions the strongest.
+For example, on a 2D image, a connectivity of 1 corresponds to 4-connected
neighbors and 2 corresponds to 8-connected neighbors.
-## Check the metadata in 'merged.fits' and clean the
-## temporary files.
-$ rm tab-1.fits tab-2.fits
-$ astfits merged.fits -h1
+One example usage of this operator can be the identification of regions above
a certain threshold, as in the command below.
+With this command, Arithmetic will first separate all pixels greater than 100
into a binary image (where pixels with a value of 1 are above that value).
+Afterwards, it will label all those that are connected.
+
+@example
+$ astarithmetic in.fits 100 gt 2 connected-components
@end example
-Like most operators, @code{constant} is not limited to tables, you can also
apply it on images.
-In the example below, we'll use @code{constant} to set all the pixels of the
input image to NaN (which is necessary in scenarios that you need to include in
an image in an analysis, but you don't want its pixels to affect the
processing):
+If your input dataset does not have a binary type, but you know all its values
are 0 or 1, you can use the @code{uint8} operator (below) to convert it to
binary.
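+
+For example, the sketch below assumes a hypothetical @file{binary-f32.fits}
+that only contains 0 or 1 values, but was saved in a floating point type:
+@example
+$ astarithmetic binary-f32.fits uint8 2 connected-components
+@end example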
+@item fill-holes
+Flip background (0) pixels surrounded by foreground (1) in a binary dataset.
+This operator takes two operands (similar to @code{connected-components}): the
second is the binary (0 or 1 valued) dataset to fill holes in and the first
popped operand is the connectivity (to define a hole).
+Imagine that in your dataset there are some holes with zero value inside the
objects with one value (for example, the output of the thresholding example of
@command{erode}) and you want to fill the holes:
@example
-$ astarithmetic image.fits nan constant
+$ astarithmetic binary.fits 2 fill-holes
@end example
-@item swap
-Swap the top two operands on the stack.
-For example the @code{index} operator doesn't pop with the top operand (the
input to @code{index}), it just adds the index image to the stack.
-In case you want your next operation to be on the input to @code{index}, you
can simply call @code{swap} and continue the operations on that image, while
keeping the indexed pixels for later steps.
-In the example below we are using the @option{--writeall} option to write the
full stack and if you open the outputs you will see that the stack order has
changed.
+@item invert
+Invert an unsigned integer dataset (will not work on other data types, see
@ref{Numeric data types}).
+This is the only operator that ignores blank values (which are set to be the
maximum values in the unsigned integer types).
+This is useful in cases where the target(s) have been imaged in absorption
in raw formats (which are unsigned integer types).
+With this option, the maximum value for the given type will be subtracted from
each pixel value, thus ``inverting'' the image, so the target(s) can be treated
as emission.
+This can be useful when the higher-level analysis methods/tools only work on
emission (positive skew in the noise, not negative).
@example
-## Index image is written in HDU 1.
-$ astarithmetic image.fits index --writeall \
- --output=ind-first.fits
-
-## image.fits in HDU 1.
-$ astarithmetic image.fits index swap --writeall \
- --output=img-first.fits
+$ astarithmetic image.fits invert
@end example
@end table
-@node Operand storage in memory or a file, , Building new dataset and stack
management, Arithmetic operators
-@subsubsection Operand storage in memory or a file
+@node Bitwise operators, Numerical type conversion operators, Mathematical
morphology operators, Arithmetic operators
+@subsubsection Bitwise operators
-In your early days of using Gnuastro, to do multiple operations, it is likely
that you will simply call Arithmetic (or Table, with column arithmetic)
multiple times: feed the output file of the first call to the second call.
-But as you get more proficient in the reverse polish notation, you will find
yourself combining many operations into one call.
-This greatly speeds up your operation, because instead of writing the dataset
to a file in one command, and reading it in the next command, it will just keep
the intermediate dataset in memory!
+@cindex Bitwise operators
+Astronomical images are usually stored as an array of multi-byte pixels with
different sizes for different precision levels (see @ref{Numeric data types}).
+For example, images from CCDs are usually in the unsigned 16-bit integer type
(each pixel takes 16 bits, or 2 bytes, of memory) and fully reduced deep images
have a 32-bit floating point type (each pixel takes 32 bits or 4 bytes).
-But adding more complexity to your operations, can make them much harder to
debug, or extend even further.
-Therefore in this section we have some special operators that behave
differently from the rest: they do not touch the contents of the data, only
where/how they are stored.
-They are designed to do complex operations, without necessarily having a
complex command.
+On the other hand, during the data reduction, we need to preserve a lot of
meta-data about some pixels.
+For example, if a cosmic ray had hit the pixel during the exposure, or if the
pixel was saturated, or is known to have a problem, or if the optical
vignetting is too strong on it.
+A crude solution is to make a separate binary image for each of these
checks, where we flag (set to 1) the pixels that satisfy the respective
condition and set the rest to zero.
+However, processing pipelines sometimes need more than 20 flags to store
important per-pixel meta-data, and recall that the smallest numeric data type
is one byte (or 8 bits, that can store up to 256 different values), while we
only need two values for each flag!
+This is a major waste of storage space!
-@table @command
-@item set-AAA
-Set the characters after the dash (@code{AAA} in the case shown here) as a
name for the first popped operand on the stack.
-The named dataset will be freed from memory as soon as it is no longer needed,
or if the name is reset to refer to another dataset later in the command.
-This operator thus enables reusability of a dataset without having to reread
it from a file every time it is necessary during a process.
-When a dataset is necessary more than once, this operator can thus help
simplify reading/writing on the command-line (thus avoiding potential bugs),
while also speeding up the processing.
+@cindex Flag (mask) images
+@cindex Mask (flag) images
+A much more optimal solution is to use the bits within each pixel to store
different flags!
+In other words, if you have an 8-bit pixel, use each bit as a flag to mark if
a certain condition has happened on a certain pixel or not.
+For example, let's set the following standard based on the four cases
mentioned above: the first bit will show that a cosmic ray has hit that pixel.
+So if a pixel is only affected by cosmic rays, it will have this sequence of
bits (note that the bit-counting starts from the right): @code{00000001}.
+The second bit shows that the pixel was saturated (@code{00000010}), the third
bit shows that it has known problems (@code{00000100}) and the fourth bit shows
that it was affected by vignetting (@code{00001000}).
-Like all operators, this operator pops the top operand off of the main
processing stack, but unlike other operands, it will not add anything back to
the stack immediately.
-It will keep the popped dataset in memory through a separate list of named
datasets (not on the main stack).
-That list will be used to add/copy any requested dataset to the main
processing stack when the name is called.
+Since each bit is independent, we can thus mark multiple metadata about that
pixel in the actual image, within a single ``flag'' or ``mask'' pixel of a flag
or mask image that has the same number of pixels.
+For example, a flag-pixel with the following bits @code{00001001} shows that
it has been affected by cosmic rays @emph{and} it has been affected by
vignetting at the same time.
+The common data type to store these flagging pixels are unsigned integer types
(see @ref{Numeric data types}).
+Therefore when you open an unsigned 8-bit flag image in a viewer like DS9, you
will see a single integer in each pixel that actually has 8 layers of metadata
in it!
+For example, the integer you will see for the bit sequences given above will
respectively be: @mymath{2^0=1} (for a pixel that only has cosmic ray),
@mymath{2^1=2} (for a pixel that was only saturated), @mymath{2^2=4} (for a
pixel that only has known problems), @mymath{2^3=8} (for a pixel that is only
affected by vignetting) and @mymath{2^0 + 2^3 = 9} (for a pixel that has a
cosmic ray @emph{and} was affected by vignetting).
-The name to give the popped dataset is part of the operator's name.
-For example, the @code{set-a} operator of the command below, gives the name
``@code{a}'' to the contents of @file{image.fits}.
-This name is then used instead of the actual filename to multiply the dataset
by two.
+You can later use this bit information to mark objects in your final analysis
or to mask certain pixels.
+For example, you may want to set all pixels affected by vignetting to NaN, but
can interpolate over cosmic rays.
+You therefore need ways to separate the pixels with a desired flag(s) from the
rest.
+It is possible to treat a flag pixel as a single integer (and try to define
certain ranges in value to select certain flags).
+But a much easier and more robust way is to actually look at each pixel as
a sequence of bits (not as a single integer!) and use the bitwise operators
below for this job.
+For more on the theory behind bitwise operators, see
@url{https://en.wikipedia.org/wiki/Bitwise_operation, Wikipedia}.
-@example
-$ astarithmetic image.fits set-a a 2 x
-@end example
-The name can be any string, but avoid strings ending with standard filename
suffixes (for example, @file{.fits})@footnote{A dataset name like @file{a.fits}
(which can be set with @command{set-a.fits}) will cause confusion in the
initial parser of Arithmetic.
-It will assume this name is a FITS file, and if it is used multiple times,
Arithmetic will abort, complaining that you have not provided enough HDUs.}.
+@table @command
+@item bitand
+Bitwise AND operator: only bits with values of 1 in both popped operands will
get the value of 1, the rest will be set to 0.
+For example (assuming numbers can be written as bit strings on the
command-line): @code{00101000 00100010 bitand} will give @code{00100000}.
+Note that the bitwise operators only work on integer type datasets.
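+
+For example, the sketch below assumes a hypothetical unsigned 8-bit flag
+image @file{flags.fits} following the convention above (where the second
+bit, with a value of 2, marks saturated pixels); the @code{0 ne} at the end
+turns the selected flag into a binary (0 or 1) image:
+@example
+$ astarithmetic flags.fits 2 bitand 0 ne -osaturated.fits
+@end example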
-One example of the usefulness of this operator is in the @code{where} operator.
-For example, let's assume you want to mask all pixels larger than @code{5} in
@file{image.fits} (extension number 1) with a NaN value.
-Without setting a name for the dataset, you have to read the file two times
from memory in a command like this:
+@item bitor
+Bitwise inclusive OR operator: The bits where at least one of the two popped
operands has a 1 value get a value of 1, the others 0.
+For example (assuming numbers can be written as bit strings on the
command-line): @code{00101000 00100010 bitor} will give @code{00101010}.
+Note that the bitwise operators only work on integer type datasets.
-@example
-$ astarithmetic image.fits image.fits 5 gt nan where -g1
-@end example
+@item bitxor
+Bitwise exclusive OR operator: A bit will be 1 if it differs between the two
popped operands.
+For example (assuming numbers can be written as bit strings on the
command-line): @code{00101000 00100010 bitxor} will give @code{00001010}.
+Note that the bitwise operators only work on integer type datasets.
-But with this operator you can simply give @file{image.fits} the name @code{i}
and simplify the command above to the more readable one below (which greatly
helps when the filename is long):
+@item lshift
+Bitwise left shift operator: shift all the bits of the second popped operand
to the left, the number of times given by the first popped operand.
+For example (assuming numbers can be written as bit strings on the
command-line): @code{00101000 2 lshift} will give @code{10100000}.
+This is equivalent to multiplication by 4.
+Note that the bitwise operators only work on integer type datasets.
-@example
-$ astarithmetic image.fits set-i i i 5 gt nan where
-@end example
+@item rshift
+Bitwise right shift operator: shift all the bits of the second popped
operand to the right, the number of times given by the first popped operand.
+For example (assuming numbers can be written as bit strings on the
command-line): @code{00101000 2 rshift} will give @code{00001010}.
+Note that the bitwise operators only work on integer type datasets.
-@item repeat
-Add N copies of the second popped operand to the stack of operands.
-N is the first popped operand.
-For example, let's assume @file{image.fits} is a @mymath{100\times100} image.
-The output of the command below will be a 3D datacube of size
@mymath{100\times100\times20} voxels (volume-pixels):
+@item bitnot
+Bitwise not (more formally known as one's complement) operator: flip all the
bits of the popped operand (note that this is the only unary, or single
operand, bitwise operator).
+In other words, any bit with a value of @code{0} is changed to @code{1} and
vice-versa.
+For example (assuming numbers can be written as bit strings on the
command-line): @code{00101000 bitnot} will give @code{11010111}.
+Note that the bitwise operators only work on integer type datasets/numbers.
+@end table
-@example
-$ astarithmetic image.fits 20 repeat 20 add-dimension-slow
-@end example
+@node Numerical type conversion operators, Random number generators, Bitwise
operators, Arithmetic operators
+@subsubsection Numerical type conversion operators
-@item tofile-AAA
-Write the top operand on the operands stack into a file called @code{AAA} (can
be any FITS file name) without changing the operands stack.
-If you do not need the dataset any more and would like to free it, see the
@code{tofilefree} operator below.
+With the operators below you can convert the numerical data type of your
input, see @ref{Numeric data types}.
+Type conversion is particularly useful when dealing with integers, see
@ref{Integer benefits and pitfalls}.
-By default, any file that is given to this operator is deleted before
Arithmetic actually starts working on the input datasets.
-The deletion can be deactivated with the @option{--dontdelete} option (as in
all Gnuastro programs, see @ref{Input output options}).
-If the same FITS file is given to this operator multiple times, it will
contain multiple extensions (in the same order that it was called).
+As an example, let's assume that your colleague gives you many single exposure
images for processing, but they have a double-precision floating point type!
+You know that, given the statistical error of a single-exposure image, it
can never have more than 6 or 7 meaningful significant digits, so you would
prefer to archive them as single-precision floating point and save space on
your computer (a double-precision floating point is also double the file
size!).
+You can do this with the @code{float32} operator described below.
-For example, the operator @command{tofile-check.fits} will write the top
operand to @file{check.fits}.
-Since it does not modify the operands stack, this operator is very convenient
when you want to debug, or understand, a string of operators and operands
given to Arithmetic: simply put @command{tofile-AAA} anywhere in the process to
see what is happening behind the scenes without modifying the overall process.
+@table @command
-@item tofilefree-AAA
-Similar to the @code{tofile} operator, with the only difference that the
dataset that is written to a file is popped from the operand stack and freed
from memory (cannot be used any more).
+@item u8
+@itemx uint8
+Convert the type of the popped operand to 8-bit unsigned integer type (see
@ref{Numeric data types}).
+The internal conversion of C will be used.
-@end table
+@item i8
+@itemx int8
+Convert the type of the popped operand to 8-bit signed integer type (see
@ref{Numeric data types}).
+The internal conversion of C will be used.
+
+@item u16
+@itemx uint16
+Convert the type of the popped operand to 16-bit unsigned integer type (see
@ref{Numeric data types}).
+The internal conversion of C will be used.
+@item i16
+@itemx int16
+Convert the type of the popped operand to 16-bit signed integer (see
@ref{Numeric data types}).
+The internal conversion of C will be used.
-@node Invoking astarithmetic, , Arithmetic operators, Arithmetic
-@subsection Invoking Arithmetic
+@item u32
+@itemx uint32
+Convert the type of the popped operand to 32-bit unsigned integer type (see
@ref{Numeric data types}).
+The internal conversion of C will be used.
-Arithmetic will do pixel to pixel arithmetic operations on the individual
pixels of input data and/or numbers.
-For the full list of operators with explanations, please see @ref{Arithmetic
operators}.
-Any operand that only has a single element (number, or single pixel FITS
image) will be read as a number; the rest of the inputs must have the same
dimensions.
-The general template is:
+@item i32
+@itemx int32
+Convert the type of the popped operand to 32-bit signed integer type (see
@ref{Numeric data types}).
+The internal conversion of C will be used.
-@example
-$ astarithmetic [OPTION...] ASTRdata1 [ASTRdata2] OPERATOR ...
-@end example
+@item u64
+@itemx uint64
+Convert the type of the popped operand to 64-bit unsigned integer type
+(see @ref{Numeric data types}).
+The internal conversion of C will be used.
-@noindent
-One line examples:
+@item f32
+@itemx float32
+Convert the type of the popped operand to 32-bit (single precision) floating
point (see @ref{Numeric data types}).
+The internal conversion of C will be used.
+For example, if @file{f64.fits} is a 64-bit floating point image, and
+you want to store it as a 32-bit floating point image, you can use the
+command below (the second command shows that the output file consumes
+half the storage):
@example
-## Calculate (10.32-3.84)^2.7 quietly (will just print 155.329):
-$ astarithmetic -q 10.32 3.84 - 2.7 pow
-
-## Inverse the input image (1/pixel):
-$ astarithmetic 1 image.fits / --out=inverse.fits
+$ astarithmetic f64.fits float32 --output=f32.fits
+$ ls -lh f64.fits f32.fits
+@end example
-## Multiply each pixel in image by -1:
-$ astarithmetic image.fits -1 x --out=negative.fits
+@item f64
+@itemx float64
+Convert the type of the popped operand to 64-bit (double precision) floating
point (see @ref{Numeric data types}).
+The internal conversion of C will be used.
+@end table
-## Subtract extension 4 from extension 1 (counting from zero):
-$ astarithmetic image.fits image.fits - --out=skysub.fits \
- --hdu=1 --hdu=4
+@node Random number generators, Box shape operators, Numerical type conversion
operators, Arithmetic operators
+@subsubsection Random number generators
-## Add two images, then divide them by 2 (2 is read as floating point):
-## Note that without the '.0', the '2' will be read/used as an integer.
-$ astarithmetic image1.fits image2.fits + 2.0 / --out=average.fits
+When you simulate data (for example, see @ref{Sufi simulates a detection}),
everything is ideal and there is no noise!
+The final step of the process is to add simulated noise to the data.
+The operators in this section are designed for that purpose.
-## Use Arithmetic's average operator:
-$ astarithmetic image1.fits image2.fits average --out=average.fits
+@table @command
-## Calculate the median of three images in three separate extensions:
-$ astarithmetic img1.fits img2.fits img3.fits median \
- -h0 -h1 -h2 --out=median.fits
-@end example
+@item mknoise-sigma
+Add a fixed noise (Gaussian standard deviation) to each element of the input
dataset.
+This operator takes two arguments: the top/first popped operand is
+the noise standard deviation; the next popped operand is the dataset
+that the noise should be added to.
-Arithmetic's notation for giving operands to operators is fully described in
@ref{Reverse polish notation}.
-The output dataset is last remaining operand on the stack.
-When the output dataset a single number, and @option{--output} is not called,
it will be printed on the standard output (command-line).
-When the output is an array, it will be stored as a file.
+When @option{--quiet} is not given, a statement will be printed on
+each invocation of this operator (if there are multiple calls to the
+@code{mknoise-*} operators, the statement will be printed multiple
+times).
+It will show the random number generator function and seed that was used in
that invocation, see @ref{Generating random numbers}.
+Reproducibility of the outputs can be ensured with the @option{--envseed}
option, see below for more.
-The name of the final file can be specified with the @option{--output} option,
but if it is not given (and the output dataset has more than one element),
Arithmetic will use ``automatic output'' on the name of the first FITS image
encountered to generate an output file name, see @ref{Automatic output}.
-By default, if the output file already exists, it will be deleted before
Arithmetic starts operation.
-However, this can be disabled with the @option{--dontdelete} option (see
below).
-At any point during Arithmetic's operation, you can also write the top operand
on the stack to a file, using the @code{tofile} or @code{tofilefree} operators,
see @ref{Arithmetic operators}.
+For example, with the command below, @file{image.fits} will be
+degraded by a noise of standard deviation 3 units.
+@example
+$ astarithmetic image.fits 3 mknoise-sigma
+@end example
-By default, the world coordinate system (WCS) information of the output
dataset will be taken from the first input image (that contains a WCS) on the
command-line.
-This can be modified with the @option{--wcsfile} and @option{--wcshdu} options
described below.
-When the @option{--quiet} option is not given, the name and extension of the
dataset used for the output's WCS is printed on the command-line.
+Alternatively, you can use this operator within column arithmetic in
+the Table program to generate a random number (centered on 0, with
+@mymath{\sigma=3}) like the first command below.
+With the second command, you can put it into a shell variable for
+later usage.
-Through operators like those starting with @code{collapse-}, the
dimensionality of the inputs may not be the same as the outputs.
-By default, when the output is 1D, Arithmetic will write it as a table, not an
image/array.
-The format of the output table (plain text or FITS ASCII or binary) can be set
with the @option{--tableformat} option, see @ref{Input output options}).
-You can disable this feature (write 1D arrays as FITS images/arrays, or to the
standard output) with the @option{--onedasimage} or @option{--onedonstdout}
options.
+@example
+$ echo 0 | asttable -c'arith $1 3 mknoise-sigma'
+$ value=$(echo 0 | asttable -c'arith $1 3 mknoise-sigma')
+$ echo $value
+@end example
-See @ref{Common options} for a review of the options in all Gnuastro programs.
-Arithmetic just redefines the @option{--hdu} and @option{--dontdelete} options
as explained below.
+You can also use this operator in combination with AWK to easily generate an
arbitrarily large table with random columns.
+In the example below, we will create a two-column table with 20 rows.
+The first column will be centered on 5 and @mymath{\sigma_1=2}, the second
will be centered on 10 and @mymath{\sigma_2=3}:
-@table @option
+@example
+$ echo 5 10 \
+ | awk '@{for(i=0;i<20;++i) print $1, $2@}' \
+ | asttable -c'arith $1 2 mknoise-sigma' \
+ -c'arith $2 3 mknoise-sigma'
+@end example
-@item -h INT/STR
-@itemx --hdu INT/STR
-The header data unit of the input FITS images, see @ref{Input output options}.
-Unlike most options in Gnuastro (which will ultimately only have one value for
this option), Arithmetic allows @option{--hdu} to be called multiple times and
the value of each invocation will be stored separately (for the unlimited
number of input images you would like to use).
-Recall that for other programs this (common) option only takes a single value.
-So in other programs, if you specify it multiple times on the command-line,
only the last value will be used and in the configuration files, it will be
ignored if it already has a value.
+By adding an extra @option{--output=random.fits}, the table will be saved into
a file called @file{random.fits}, and you can change the @code{i<20} to
@code{i<5000} to have 5000 rows instead.
+Of course, if your input table has different values in the desired
+column, the noisy distribution will be centered on each input
+element, but all will have the same scatter/sigma.
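+
+For example, the minimal sketch below adds a scatter of
+@mymath{\sigma=3} to three different input values; each output row
+will be centered on its own input row:
+
+@example
+$ printf "10\n20\n30\n" | asttable -c'arith $1 3 mknoise-sigma'
+@end example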
-The order of the values to @option{--hdu} has to be in the same order as input
FITS images.
-Options are first read from the command-line (from left to right), then
top-down in each configuration file, see @ref{Configuration file precedence}.
+You can use the @option{--envseed} option to fix the random number generator
seed (and thus get a reproducible result).
+For more on @option{--envseed}, see @ref{Generating random numbers}.
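+
+For example, the sketch below (assuming a Bourne-like shell) should
+return the same noisy value on every run, because the seed is fixed
+through the environment:
+
+@example
+$ export GSL_RNG_SEED=1
+$ echo 0 | asttable --envseed -c'arith $1 3 mknoise-sigma'
+@end example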
+When using column arithmetic in Table, it may happen that multiple columns
need random numbers (with any of the @code{mknoise-*} operators) in one call of
@command{asttable}.
+In such cases, the value given to @code{GSL_RNG_SEED} is incremented by one on
every call to the @code{mknoise-*} operators.
+Without this increment, when the column values are the same (which
+happens often for datasets without noise), the returned values for
+all columns would be identical.
+But this feature has a side-effect: if the order of calling the
+@code{mknoise-*} operators changes, the seeds used for each operator
+will change@footnote{We have defined
+@url{https://savannah.gnu.org/task/?15971, Task 15971} in Gnuastro's
+project management system to address this.
+If you need this feature, please send us an email at
+@code{bug-gnuastro@@gnu.org} (to motivate us in its implementation).}.
-If the number of HDUs is less than the number of input images, Arithmetic will
abort and notify you.
-However, if there are more HDUs than FITS images, there is no problem: they
will be used in the given order (every time a FITS image comes up on the stack)
and the extra HDUs will be ignored in the end.
-So there is no problem with having extra HDUs in the configuration files and
by default several HDUs with a value of @option{0} are kept in the system-wide
configuration file when you install Gnuastro.
+In case each data element should have an independent sigma, the first popped
operand can be a dataset of the same size as the second.
+In this case, for each element, a different noise measure (for example, sigma
in @code{mknoise-sigma}) will be used.
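+
+For example, the minimal sketch below assumes a hypothetical
+@file{sigma.fits} that has the same size as @file{image.fits} and
+contains the desired standard deviation of each pixel:
+
+@example
+$ astarithmetic image.fits sigma.fits mknoise-sigma -g1
+@end example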
-@item -g INT/STR
-@itemx --globalhdu INT/STR
-Use the value to this option as the HDU of all input FITS files.
-This option is very convenient when you have many input files and the dataset
of interest is in the same HDU of all the files.
-When this option is called, any values given to the @option{--hdu} option
(explained above) are ignored and will not be used.
+@item mknoise-poisson
+@cindex Poisson noise
+Add Poisson noise to each element of the input dataset (see @ref{Photon
counting noise}).
+This operator takes two arguments: the first popped operand (just
+before the operator) is the @emph{per-pixel} background value (in
+units of electron counts); the second popped operand is the dataset
+that the noise should be added to.
-@item -w FITS
-@itemx --wcsfile FITS
-FITS Filename containing the WCS structure that must be written to the output.
-The HDU/extension should be specified with @option{--wcshdu}.
+@cindex Dark night
+@cindex Gray night
+@cindex Nights (dark or gray)
+Recall that the background values reported by observatories (for
+example, to define dark or gray nights), or in papers, are usually
+given in units of magnitudes per arcsecond squared.
+You need to do the conversion to counts per pixel manually.
+The conversion of magnitudes to counts is described below.
+For converting arcseconds squared to number of pixels, you can use the
@option{--pixelscale} option of @ref{Fits}.
+For example, @code{astfits image.fits --pixelscale}.
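+
+For example, the minimal sketch below converts an assumed background
+of 19 magnitudes per arcsecond squared into counts per pixel, for a
+hypothetical zero point of 22.5 and pixel scale of 0.27 arcseconds
+(first convert the magnitude to counts per arcsecond squared, then
+multiply by the pixel area):
+
+@example
+$ astarithmetic -q 19 22.5 mag-to-counts 0.27 0.27 x x
+@end example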
-When this option is used, the respective WCS will be read before any
processing is done on the command-line and directly used in the final output.
-If the given file does not have any WCS, then the default WCS (first file on
the command-line with WCS) will be used in the output.
+Except for the noise model, this operator is very similar to
+@code{mknoise-sigma} and the examples there apply here too.
+The main difference with @code{mknoise-sigma} is that in a Poisson
distribution the scatter/sigma will depend on each element's value.
-This option will mostly be used when the default file (first of the set of
inputs) is not the one containing your desired WCS.
-But with this option, you can also use Arithmetic to rewrite/change the WCS of
an existing FITS dataset from another file:
+For example, let's assume you have made a mock image called
+@file{mock.fits} with @ref{MakeProfiles}, with an assumed zero point
+of 22.5 (for more on the zero point, see @ref{Brightness flux
+magnitude}).
+Let's assume the background level for the Poisson noise has a value of 19
magnitudes.
+You can first use the @code{mag-to-counts} operator to convert this
+background magnitude into counts, then feed the background value in
+counts to the @code{mknoise-poisson} operator:
@example
-$ astarithmetic data.fits --wcsfile=other.fits -ofinal.fits
+$ astarithmetic mock.fits 19 22.5 mag-to-counts \
+ mknoise-poisson
@end example
-@item -W STR
-@itemx --wcshdu STR
-HDU/extension to read the WCS within the file given to @option{--wcsfile}.
-For more, see the description of @option{--wcsfile}.
+Try changing the background value from 19 to 10 to see the effect!
+Recall that the tutorial @ref{Sufi simulates a detection} shows how you can
use MakeProfiles to build mock images.
-@item --envseed
-Use the environment for the random number generator settings in operators that
need them (for example, @code{mknoise-sigma}).
-This is very important for obtaining reproducible results, for more see
@ref{Generating random numbers}.
+@item mknoise-uniform
+Add uniform noise to each element of the input dataset.
+This operator takes two arguments: the top/first popped operand is
+the width of the interval; the second popped operand is the dataset
+that the noise should be added to (each element will be the center of
+the interval).
+The returned random values may happen to be the minimum interval value, but
will never be the maximum.
+Except for the noise model, this operator behaves very similarly to
+@code{mknoise-sigma}; see the explanation there for more.
-@item -n STR
-@itemx --metaname=STR
-Metadata (name) of the output dataset.
-For a FITS image or table, the string given to this option is written in the
@code{EXTNAME} or @code{TTYPE1} keyword (respectively).
+For example, with the command below, a random value will be selected
+between 10 and 14 (centered on 12, which is the only input data
+element, with a total width of 4).
-If this keyword is present in a FITS extension, it will be printed in the
table output of a command like @command{astfits image.fits} (for images) or
@command{asttable table.fits -i} (for tables).
-This metadata can be very helpful for yourself in the future (when you have
forgotten the details), so it is recommended to use this option for files that
should be archived or shared with colleagues.
+@example
+$ echo 12 | asttable -c'arith $1 4 mknoise-uniform'
+@end example
-@item -u STR
-@itemx --metaunit=STR
-Metadata (units) of the output dataset.
-For a FITS image or table, the string given to this option is written in the
@code{BUNIT} or @code{TTYPE1} keyword respectively.
-In the case of tables, recall that the Arithmetic program only outputs a
single column, you should use column arithmetic in Table for more than one
column (see @ref{Column arithmetic}).
-For more on the importance of metadata, see the description of
@option{--metaname}.
+Similar to the example in @code{mknoise-sigma}, you can pipe the output of
@command{echo} to @command{awk} before passing it to @command{asttable} to
generate a full column of uniformly selected values within the same interval.
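+
+For example, the minimal sketch below fills a 100-row column with
+values selected uniformly between 10 and 14:
+
+@example
+$ echo 12 \
+      | awk '@{for(i=0;i<100;++i) print $1@}' \
+      | asttable -c'arith $1 4 mknoise-uniform'
+@end example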
-@item -c STR
-@itemx --metacomment=STR
-Metadata (comments) of the output dataset.
-For a FITS image or table, the string given to this option is written in the
@code{COMMENT} or @code{TCOMM1} keyword respectively.
-In the case of tables, recall that the Arithmetic program only outputs a
single column, you should use column arithmetic in Table for more than one
column (see @ref{Column arithmetic}).
-For more on the importance of metadata, see the description of
@option{--metaname}.
+@item random-from-hist-raw
+Generate random values from a custom distribution (defined by a histogram).
+The output will have a double-precision floating point type (see @ref{Numeric
data types}).
+This operator takes three operands:
+@itemize
+@item
+The first popped operand (nearest to the operator) contains the
+histogram values.
+The histogram is a 1-dimensional dataset (a table column) and
+contains the probability of obtaining a certain interval of values.
+The histogram does not have to be normalized: the GNU Scientific
+Library (or GSL, which is used by Gnuastro for this operator) will
+normalize it internally.
+The value of each bin (whose probability is given in the histogram) is given
in the second popped operand.
+Therefore these two operands have to have the same number of rows.
+@item
+The second popped operand is the bin value (usually the bin center,
+but it can be anything).
+The probability of each bin is defined in the histogram operand (first popped
operand).
+The bins can have any width (do not have to be evenly spaced), and any order.
+Just make sure that the same row in the bins column corresponds to the same
row in the histogram: the number of rows in the bins and histogram must be
equal.
+@item
+The third popped operand is the dataset that the random values should be
written over.
+Effectively only its size will be used by this operator (all values will be
over-written as a double-precision floating point number).
+@end itemize
-@item -O
-@itemx --onedasimage
-Write final dataset as a FITS image/array even if it has a single dimension.
-By default, if the output is 1D, it will be written as a table, see above.
-If the output has more than one dimension, this option is redundant.
+The first two operands have to be single-dimensional (a table column) and have
the same number of rows, but the last popped operand can have any number of
dimensions.
+You can use the @code{load-col-} operator to load the two bins and histogram
columns from an external file (see @ref{Loading external columns}).
-@item -s
-@itemx --onedonstdout
-Write final dataset (only when it is 1D) to standard output, not as a file.
-By default 1D datasets will be written as a table, see above.
-If the output has more than one dimension, this option is redundant.
+For example, in the command below, we first construct a fake histogram to
represent a @mymath{y=x^2} distribution with AWK.
+We aim to distribute random values from this distribution in a
@mymath{100\times100} image.
+Therefore, we use the @command{makenew} operator to construct an
+empty image of that size, use the @command{load-col-} operator to
+load the bin and histogram columns into Arithmetic, and put the
+output in @file{random.fits}.
+Finally we visually inspect @file{random.fits} with DS9 and also have a look
at its pixel distribution with @command{aststatistics}.
-@item -D
-@itemx --dontdelete
-Do Not delete the output file, or files given to the @code{tofile} or
@code{tofilefree} operators, if they already exist.
-Instead append the desired datasets to the extensions that already exist in
the respective file.
-Note it does not matter if the final output file name is given with the
@option{--output} option, or determined automatically.
+@example
+$ echo "" | awk '@{for(i=1;i<5;++i) print i, i*i@}' \
+ > histogram.txt
-Arithmetic treats this option differently from its default operation in other
Gnuastro programs (see @ref{Input output options}).
-If the output file exists, when other Gnuastro programs are called with
@option{--dontdelete}, they simply complain and abort.
-But when Arithmetic is called with @option{--dontdelete}, it will appended the
dataset(s) to the existing extension(s) in the file.
+$ cat histogram.txt
+1 1
+2 4
+3 9
+4 16
-@item -a
-@itemx --writeall
-Write all datasets on the stack as separate HDUs in the output file.
-This only affects datasets with multiple dimensions (or single-dimension
datasets when the @option{--onedasimg} is called).
-This option is useful to debug Arithmetic calls: to check all the images on
the stack while you are designing your operation.
-The top dataset on the stack will be on HDU number 1 of the output, the second
dataset will be on HDU number 2 and so on.
-@end table
+$ astarithmetic 100 100 2 makenew \
+ load-col-1-from-histogram.txt \
+ load-col-2-from-histogram.txt \
+ random-from-hist-raw \
+ --output=random.fits
-Arithmetic accepts two kinds of input: images and numbers.
-Images are considered to be any of the inputs that is a file name of a
recognized type (see @ref{Arguments}) and has more than one element/pixel.
-Numbers on the command-line will be read into the smallest type (see
@ref{Numeric data types}) that can store them, so @command{-2} will be read as
a @code{char} type (which is signed on most systems and can thus keep negative
values), @command{2500} will be read as an @code{unsigned short} (all positive
numbers will be read as unsigned), while @code{3.1415926535897} will be read as
a @code{double} and @code{3.14} will be read as a @code{float}.
-To force a number to be read as float, put a @code{.} after it (possibly
followed by a zero for easier readability), or add an @code{f} after it.
-Hence while @command{5} will be read as an integer, @command{5.},
@command{5.0} or @command{5f} will be added to the stack as @code{float} (see
@ref{Reverse polish notation}).
+$ astscript-fits-view random.fits
-Unless otherwise stated (in @ref{Arithmetic operators}), the operators can
deal with numeric multiple data types (see @ref{Numeric data types}).
-For example, in ``@command{a.fits b.fits +}'', the image types can be
@code{long} and @code{float}.
-In such cases, C's internal type conversion will be used.
-The output type will be set to the higher-ranking type of the two inputs.
-Unsigned integer types have smaller ranking than their signed counterparts and
floating point types have higher ranking than the integer types.
-So the internal C type conversions done in the example above are equivalent to
this piece of C:
+$ aststatistics random.fits --asciihist --numasciibins=50
+ | *
+ | *
+ | *
+ | *
+ | * *
+ | * *
+ | * *
+ | * * *
+ | * * *
+ |* * * *
+ |* * * *
+ |--------------------------------------------------
+@end example
+
+As you see, the 10000 pixels in the image only have values 1, 2, 3 or 4 (which
were the values in the bins column of @file{histogram.txt}), and the number of
times each of these values occurs follows the @mymath{y=x^2} distribution.
+
+Generally, any value given in the bins column will be used for the final
output values.
+For example, in the command below (for generating a histogram from an
+analytical function), we are shifting the bins by 20 (while keeping
+the same probability distribution of @mymath{y=x^2}).
+If you re-run the Arithmetic command above after this, you will
+notice that the pixel values are now one of 21, 22, 23 or 24 (instead
+of 1, 2, 3, or 4).
+But the shape of the histogram of the resulting random distribution will be
unchanged.
@example
-size_t i;
-long a[100];
-float b[100], out[100];
-for(i=0;i<100;++i) out[i]=a[i]+b[i];
+$ echo "" | awk '@{for(i=1;i<5;++i) print 20+i, i*i@}' \
+ > histogram.txt
@end example
-@noindent
-Relying on the default C type conversion significantly speeds up the
processing and also requires less RAM (when using very large images).
+If you do not want the outputs to have exactly the value of the bin
identifier, but be a randomly selected value from a uniform distribution within
the bin, you should use @command{random-from-hist} (see below).
-Some operators can only work on integer types (of any length, for example,
bitwise operators) while others only work on floating point types, (currently
only the @code{pow} operator).
-In such cases, if the operand type(s) are different, an error will be printed.
-Arithmetic also comes with internal type conversion operators which you can
use to convert the data into the appropriate type, see @ref{Arithmetic
operators}.
+As mentioned above, the output will have a double-precision floating point
type (see @ref{Numeric data types}).
+Therefore, by default each element of the output will consume 8 bytes
(64-bits) of storage.
+This is usually far more than the statistical error/precision of your
+data (and just results in wasted storage in your file system, wasted
+RAM when a program uses the data, and a slower running time).
-@cindex Options
-The hyphen (@command{-}) can be used both to specify options (see
@ref{Options}) and also to specify a negative number which might be necessary
in your arithmetic.
-In order to enable you to do this, Arithmetic will first parse all the input
strings and if the first character after a hyphen is a digit, then that hyphen
is temporarily replaced by the vertical tab character which is not commonly
used.
-The arguments are then parsed and these strings will not be specified as an
option.
-Then the given arguments are parsed and any vertical tabs are replaced back
with a hyphen so they can be read as negative numbers.
-Therefore, as long as the names of the files you want to work on, do not start
with a vertical tab followed by a digit, there is no problem.
-An important consequence of this implementation is that you should not write
negative fractions like this: @command{-.3}, instead write them as
@command{-0.3}.
+It is therefore recommended to use a type-conversion operator after this
operator to put the output in the smallest type that can be used to safely
store your data without wasting storage, RAM or time.
+For the list of type conversion operators, see @ref{Numerical type conversion
operators}.
+Recall that you already know the values returned by this operator (they are
one of the values in the bins column).
-@cindex AWK
-@cindex GNU AWK
-Without any images, Arithmetic will act like a simple calculator and print the
resulting output number on the standard output like the first example above.
-If you really want such calculator operations on the command-line, AWK (GNU
AWK is the most common implementation) is much faster, easier and much more
powerful.
-For example, the numerical one-line example above can be done with the
following command.
-In general AWK is a fantastic tool and GNU AWK has a wonderful manual
(@url{https://www.gnu.org/software/gawk/manual/}).
-So if you often confront situations like this, or have to work with large text
tables/catalogs, be sure to checkout AWK and simplify your life.
+For instance, in the example above, the whole image only has values
+1, 2, 3 or 4.
+Since they are always positive and are below 255, we can safely place them in
an unsigned 8-bit integer (see @ref{Numeric data types}) with the command below
(note the @code{uint8} after the operator name, and that we are using a
different name for the output).
+After building the new image, let's have a look at the sizes of the two images
with @command{ls -l}:
@example
-$ echo "" | awk '@{print (10.32-3.84)^2.7@}'
-155.329
+$ astarithmetic 100 100 2 makenew \
+ load-col-1-from-histogram.txt \
+ load-col-2-from-histogram.txt \
+ random-from-hist-raw uint8 \
+ --output=random-u8.fits
+
+$ ls -lh random.fits random-u8.fits
+-rw-r--r-- 1 name name 85K Jan 01 13:40 random.fits
+-rw-r--r-- 1 name name 17K Jan 01 13:45 random-u8.fits
@end example
+As you see, when using a suitable data type, we can shrink the size
+of the file significantly without losing any information (from 85
+kilobytes to 17 kilobytes).
+The difference is much more significant for larger (real-world)
+datasets, so be sure to always set the output data type after calling
+this operator.
+@item random-from-hist
+Similar to @code{random-from-hist-raw}, but instead of returning the
+exact bin value, return a random value from a uniform distribution
+within each bin.
+Therefore the following limitations have to be taken into account (compared to
@code{random-from-hist-raw}):
+@itemize
+@item
+The number associated with each bin (in the bin column) should be its center.
+@item
+The bins have to be in ascending order (so the second row in the bin
+column is larger than the first).
+@item
+The bin widths (distance from one bin to another) have to be fixed.
+@end itemize
+For a demonstration, let's replace @code{random-from-hist-raw} with
@code{random-from-hist} in the example of the description of
@code{random-from-hist-raw}.
+Note how we are manually converting the output of this operator into
single-precision floating point (32-bit, since the default 64-bit precision is
statistically meaningless in this scenario and we do not want to waste storage,
memory and running time):
+@example
+$ echo "" | awk '@{for(i=1;i<5;++i) print i, i*i@}' \
+ > histogram.txt
+$ astarithmetic 100 100 2 makenew \
+ load-col-1-from-histogram.txt \
+ load-col-2-from-histogram.txt \
+ random-from-hist float32 \
+ --output=random.fits
+$ aststatistics random.fits --asciihist --numasciibins=50
+ | *
+ | *** ********
+ | ************
+ | *************
+ | * * *************
+ | * ***********************
+ | *************************
+ | *************************
+ | *************************************
+ |********* * **************************************
+ |**************************************************
+ |--------------------------------------------------
+@end example
+You can see that the pixel values of @file{random.fits} are no longer
+just 1, 2, 3 or 4.
+Instead, the values within each bin are selected from a uniform distribution
covering that bin.
+This creates the step-like feature in the histogram of the output.
+Of course, this extra uniform random number generation can make your
+program slower, so be sure to check if it is worth it.
+In particular, one way to avoid this (and use @command{random-from-hist-raw}
with a more contiguous-looking output distribution) is to simply use a
higher-resolution histogram (assuming it is possible: you have a sufficient
number of data points, or you have an analytical expression that you can sample
at smaller bin sizes).
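+
+For example, the minimal sketch below samples the same
+@mymath{y=x^2} expression at 40 points instead of 4, producing a
+higher-resolution histogram over the same interval:
+
+@example
+$ echo "" | awk '@{for(i=1;i<=40;++i) print i/10, (i/10)^2@}' \
+       > histogram.txt
+@end example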
+To better demonstrate this operator and its practical usage in everyday
research, let's look at another example:
+Assume you want to get 100 random star magnitudes that follow the
+real-world Gaia Data Release 3 magnitude distribution within a radius
+of 2 degrees around the (RA,Dec) coordinate of (1.23,3.45).
+Let's further assume that you want to distribute them uniformly over an image
of size 1000 by 1000 pixels.
+So your desired output table should have three columns, the first two are
pixel positions of each star, and the third is the magnitude.
+First, we need to query the Gaia database and ask for all the magnitudes in
this region of the sky.
+We know that Gaia is not complete for stars fainter than the 20th magnitude,
so we will use the @option{--range} option and only ask for those stars that
are brighter than magnitude 20.
+@example
+$ astquery gaia --dataset=dr3 --center=1.23,3.45 --radius=2 \
+ --column=phot_g_mean_mag --output=gaia.fits \
+ --range=phot_g_mean_mag,-inf,20
+@end example
+We now have more than 25000 magnitudes in @file{gaia.fits}!
+To get a more accurate random sampling of our stars, let's construct a
histogram with 500 bins, and generate our three desired randomly selected
columns:
+@example
+$ aststatistics gaia.fits --histogram --numbins=500 \
+ --output=gaia-hist.fits
+$ asttable gaia-hist.fits -i
-@node Convolve, Warp, Arithmetic, Data manipulation
-@section Convolve
+$ echo 1000 \
+ | awk '@{for(i=0;i<100;++i) print $1/2@}' \
+    | asttable -c'arith $1 1000 mknoise-uniform' \
+               -c'arith $1 1000 mknoise-uniform' \
+ -c'arith $1 \
+ load-col-1-from-gaia-hist.fits-hdu-1 \
+ load-col-2-from-gaia-hist.fits-hdu-1 \
+ random-from-hist float32'
+@end example
-@cindex Convolution
-@cindex Neighborhood
-@cindex Weighted average
-@cindex Average, weighted
-@cindex Kernel, convolution
-On an image, convolution can be thought of as a process to blur or remove the
contrast in an image.
-If you are already familiar with the concept and just want to run Convolve,
you can jump to @ref{Convolution kernel} and @ref{Invoking astconvolve} and
skip the lengthy introduction on the basic definitions and concepts of
convolution.
+These columns can easily be placed in the format for @ref{MakeProfiles} to be
inserted into an image automatically.
+@end table
-There are generally two methods to convolve an image.
-The first and more intuitive one is in the ``spatial domain'' or using the
actual image pixel values, see @ref{Spatial domain convolution}.
-The second method is when we manipulate the ``frequency domain'', or work on
the magnitudes of the different frequencies that constitute the image, see
@ref{Frequency domain and Fourier operations}.
-Understanding convolution in the spatial domain is more intuitive and thus
recommended if you are just starting to learn about convolution.
-However, getting a good grasp of the frequency domain is a little more
involved and needs some concentration and some mathematical proofs.
-However, its reward is a faster operation and more importantly a very
fundamental understanding of this very important operation.
+@node Box shape operators, Loading external columns, Random number generators,
Arithmetic operators
+@subsubsection Box shape operators
-@cindex Detection
-@cindex Atmosphere
-@cindex Blur image
-@cindex Cosmic rays
-@cindex Pixel mixing
-@cindex Mixing pixel values
-Convolution of an image will generally result in blurring the image because it
mixes pixel values.
-In other words, if the image has sharp differences in neighboring pixel
values@footnote{In astronomy, the only major time we confront such sharp
borders in signal are cosmic rays.
-All other sources of signal in an image are already blurred by the atmosphere
or the optics of the instrument.}, those sharp differences will become smoother.
-This has very good consequences in detection of signal in noise for example.
-In an actual observed image, the variation in neighboring pixel values due to
noise can be very high.
-But after convolution, those variations will decrease and we have a better
hope in detecting the possible underlying signal.
-Another case where convolution is extensively used is in mock images and
modeling in general, convolution can be used to simulate the effect of the
atmosphere or the optical system on the mock profiles that we create, see
@ref{PSF}.
-Convolution is a very interesting and important topic in any form of signal
analysis (including astronomical observations).
-So we have thoroughly@footnote{A mathematician will certainly consider this
explanation is incomplete and inaccurate.
-However this text is written for an understanding on the operations that are
done on a real (not complex, discrete and noisy) astronomical image, not any
general form of abstract function} explained the concepts behind it in the
following sub-sections.
+The operators here help you in defining or using coordinates that form a
``box'' (a rectangular region).
-@menu
-* Spatial domain convolution:: Only using the input image values.
-* Frequency domain and Fourier operations:: Using frequencies in input.
-* Spatial vs. Frequency domain:: When to use which?
-* Convolution kernel:: How to specify the convolution kernel.
-* Invoking astconvolve:: Options and argument to Convolve.
-@end menu
+@table @command
+@item box-around-ellipse
+Return the width (along horizontal) and height (along vertical) of a box that
encompasses an ellipse with the same center point.
+The top-popped operand is assumed to be the position angle (angle from the
horizontal axis) in @emph{degrees}.
+The second and third popped operands are the minor and major radii of the
ellipse respectively.
+This operator outputs two operands on the general stack.
+The first one is the width and the second (which will be the top one when this
operator finishes) is the height.
-@node Spatial domain convolution, Frequency domain and Fourier operations,
Convolve, Convolve
-@subsection Spatial domain convolution
+If the value of the second popped operand (minor axis) is larger than
+the third (major axis), a NaN value will be written for both the
+width and height of that element and a warning will be printed (the
+warning can be disabled with the @option{--quiet} option).
-The pixels in an input image represent different ``spatial'' positions,
therefore when convolution is done only using the actual input pixel values, we
name the process as being done in the ``Spatial domain''.
-In particular this is in contrast to the ``frequency domain'' that we will
discuss later in @ref{Frequency domain and Fourier operations}.
-In the spatial domain (and in realistic situations where the image and the
convolution kernel do not extend to infinity), convolution is the process of
changing the value of one pixel to the @emph{weighted} average of all the
pixels in its @emph{neighborhood}.
+As an example, if your ellipse has a major axis radius of 10 units, a minor
axis radius of 4 units and a position angle of 20 degrees, you can estimate the
bounding box with this command:
-The `neighborhood' of each pixel (how many pixels in which direction) and the
`weight' function (how much each neighboring pixel should contribute depending
on its position) are given through a second image which is known as a
``kernel''@footnote{Also known as filter, here we will use `kernel'.}.
+@example
+$ echo "10 4 20" \
+ | asttable -c'arith $1 $2 $3 box-around-ellipse'
+@end example
-@menu
-* Convolution process:: More basic explanations.
-* Edges in the spatial domain:: Dealing with the edges of an image.
-@end menu
+Alternatively, if your three values are in separate FITS arrays/images,
+you can use the command below to have the width and height in similarly
+sized FITS arrays.
+In this example @file{a.fits} and @file{b.fits} are respectively the
+major and minor axis lengths and @file{pa.fits} is the position angle
+(in degrees).
+Also, in all three, we assume the first extension is used.
+After it is done, the height of the box will be put in @file{h.fits} and the
width will be in @file{w.fits}.
+Just note that because this operator has two output datasets, you need to
first write the height (top output operand) into a file and free it with the
@code{tofilefree-} operator, then write the width in the file given to
@option{--output}.
-@node Convolution process, Edges in the spatial domain, Spatial domain
convolution, Spatial domain convolution
-@subsubsection Convolution process
+@example
+$ astarithmetic a.fits b.fits pa.fits box-around-ellipse \
+ tofilefree-h.fits -ow.fits -g1
+@end example
-In convolution, the kernel specifies the weight and positions of the neighbors
of each pixel.
-To find the convolved value of a pixel, the central pixel of the kernel is
placed on that pixel.
-The values of each overlapping pixel in the kernel and image are multiplied by
each other and summed for all the kernel pixels.
-To have one pixel in the center, the sides of the convolution kernel have to
be an odd number.
-This process effectively mixes the pixel values of each pixel with its
neighbors, resulting in a blurred image compared to the sharper input image.
+Finally, if you need to treat the width and height separately for further
processing, you can call the @code{set-} operator two times afterwards like
below.
+Recall that the @code{set-} operator will pop the top operand, and put it in
memory with a certain name, bringing the next operand to the top of the stack.
-@cindex Linear spatial filtering
-Formally, convolution is one kind of linear `spatial filtering' in image
processing texts.
-If we assume that the kernel has @mymath{2a+1} and @mymath{2b+1} pixels on
each side, the convolved value of a pixel placed at @mymath{x} and @mymath{y}
(@mymath{C_{x,y}}) can be calculated from the neighboring pixel values in the
input image (@mymath{I}) and the kernel (@mymath{K}) from
+For example, let's assume @file{catalog.fits} has at least three columns
@code{MAJOR}, @code{MINOR} and @code{PA} which specify the major axis, minor
axis and position angle respectively.
+But you want the final width and height in 32-bit floating point numbers (not
the default 64-bit, which may be too much precision in many scenarios).
+You can do this with the command below (note that you can also break
+lines with @key{\} within the single-quote environment):
-@dispmath{C_{x,y}=\sum_{s=-a}^{a}\sum_{t=-b}^{b}K_{s,t}\times{}I_{x+s,y+t}.}
+@example
+$ asttable catalog.fits \
+ -c'arith MAJOR MINOR PA box-around-ellipse \
+ set-height set-width \
+ width float32 height float32'
+@end example
-@cindex Correlation
-@cindex Convolution
-Formally, any pixel that is outside of the image in the equation above will be
considered to be zero (although, see @ref{Edges in the spatial domain}).
-When the kernel is symmetric about its center the blurred image has the same
orientation as the original image.
-However, if the kernel is not symmetric, the image will be affected in the
opposite manner, this is a natural consequence of the definition of spatial
filtering.
-In order to avoid this we can rotate the kernel about its center by 180
degrees so the convolved output can have the same original orientation (this is
done by default in the Convolve program).
-Technically speaking, only if the kernel is flipped the process is known as
@emph{Convolution}.
-If it is not it is known as @emph{Correlation}.
+@item box-vertices-on-sphere
+@cindex Polygon
+@cindex Vertices on sphere (sky)
+Convert a box center and width to the coordinates of the vertices of
+the box on a left-handed spherical coordinate system.
+In a left-handed spherical coordinate system, the longitude increases towards
the left while north is up (as in the RA and Dec direction of the equatorial
coordinate system used in astronomy).
+This operator therefore takes four input operands (the RA and Dec of the box's
center, as well as the width of the box in each direction).
-To be a weighted average, the sum of the weights (the pixels in the kernel)
has to be unity.
-This will have the consequence that the convolved image of an object and
unconvolved object will have the same brightness (see @ref{Brightness flux
magnitude}), which is natural, because convolution should not eat up the object
photons, it only disperses them.
+After it is complete, this operator places 8 operands on the stack which
contain the RA and Dec of the four vertices of the box in the following
anti-clockwise order:
+@enumerate
+@item
+Bottom-left vertex Longitude (RA)
+@item
+Bottom-left vertex Latitude (Dec)
+@item
+Bottom-right vertex Longitude (RA)
+@item
+Bottom-right vertex Latitude (Dec)
+@item
+Top-right vertex Longitude (RA)
+@item
+Top-right vertex Latitude (Dec)
+@item
+Top-left vertex Longitude (RA)
+@item
+Top-left vertex Latitude (Dec)
+@end enumerate
-The convolution of each pixel is independent of the other pixels, and in some
cases, it may be necessary to convolve different parts of an image separately
(for example, when you have different amplifiers on the CCD).
-Therefore, to speed up spatial convolution, Gnuastro first defines a
tessellation over the input; assigning each group of pixels to ``tiles''.
-It then does the convolution in parallel on each tile.
-For more on how Gnuastro's programs create the tile grid (tessellation), see
@ref{Tessellation}.
+For example, with the command below, we will retrieve the vertex
+coordinates of a rectangle around a point with RA=20 and Dec=0 (on
+the equator).
+The rectangle will have a 1 degree edge along the RA direction and a
+2 degree edge along the declination.
+In this example, we are using @option{-Afixed -B2} only for
+demonstration purposes, due to the round numbers!
+In general, it is best to write your outputs to a binary FITS table to
preserve the full precision (see @ref{Printing floating point numbers}).
+@example
+$ echo "20 0 1 2" \
+ | asttable -Afixed -B2 \
+ -c'arith $1 $2 $3 $4 box-vertices-on-sphere'
+20.50 -1.00 19.50 -1.00 19.50 1.00 20.50 1.00
+@end example
+We see that the bottom-left vertex is at (RA,Dec) of
+@mymath{(20.50,-1.00)} and the top-right vertex is at
+@mymath{(19.50,1.00)}.
+These could easily have been calculated manually by adding and
+subtracting!
+But you will see that the complexity arises at higher/lower
+declinations.
+For example, with the command below, let's see the vertex coordinates
+of the same box after moving its center to (RA,Dec) of (20,85):
-@node Edges in the spatial domain, , Convolution process, Spatial domain
convolution
-@subsubsection Edges in the spatial domain
+@example
+$ echo "20 85 1 2" \
+ | asttable -Afixed -B2 \
+ -c'arith $1 $2 $3 $4 box-vertices-on-sphere'
+24.78 84.00 15.22 84.00 12.83 86.00 27.17 86.00
+@end example
-In purely `linear' spatial filtering (convolution), there are problems with
the edges of the input image.
-Here we will explain the problem in the spatial domain.
-For a discussion of this problem from the frequency domain perspective, see
@ref{Edges in the frequency domain}.
-The problem originates from the fact that on the edges, in practice, the sum
of the weights we use on the actual image pixels is not unity@footnote{Because
we assumed the overlapping pixels outside the input image have a value of
zero.}.
-For example, as discussed above, a profile in the center of an image will have
the same brightness before and after convolution.
-However, for partially imaged profile on the edge of the image, the brightness
(sum of its pixel fluxes within the image, see @ref{Brightness flux magnitude})
will not be equal, some of the flux is going to be `eaten' by the edges.
-
-If you run @command{$ make check} on the source files of Gnuastro, you can see
this effect by comparing the @file{convolve_frequency.fits} with
@file{convolve_spatial.fits} in the @file{./tests/} directory.
-In the spatial domain, by default, no assumption will be made about pixels
outside of the image or any blank pixels in the image.
-The problem explained above will also occur on the sides of blank regions (see
@ref{Blank pixels}).
-The solution to this edge effect problem is only possible in the spatial
domain.
-For pixels near the edge, we have to abandon the assumption that the sum of
the kernel pixels is unity during the convolution process@footnote{Of course
the sum of the kernel pixels still have to be unity in general.}.
-So taking @mymath{W} as the sum of the kernel pixels that overlapped with
non-blank and in-image pixels, the equation in @ref{Convolution process} will
become:
-
-@dispmath{C_{x,y}= { \sum_{s=-a}^{a}\sum_{t=-b}^{b}K_{s,t}\times{}I_{x+s,y+t}
\over W}.}
+Even though we didn't change the central RA (20) or the size of the
+box along the RA (1 degree), the RA of the bottom-left vertex is now
+at 24.78; almost 5 degrees away!
+This occurs because of the spherical coordinate system, in which we
+measure the longitude (e.g., RA) in the following way:
+@enumerate
+@item
+@cindex Meridian
+@cindex Great circle
+@cindex Circle (great)
+Draw a meridian that passes through your point.
+The meridian is half of a
+@url{https://en.wikipedia.org/wiki/Great_circle, great circle} (whose
+diameter is equal to the sphere's diameter) that passes through both
+poles.
+@item
+Find the intersection of that meridian with the equator.
+@item
+The distance between the intersection and the reference point (along
+the equator) defines the longitude angle.
+@end enumerate
-@noindent
-In this manner, objects which are near the edges of the image or blank pixels
will also have the same brightness (within the image) before and after
convolution.
-This correction is applied by default in Convolve when convolving in the
spatial domain.
-To disable it, you can use the @option{--noedgecorrection} option.
-In the frequency domain, there is no way to avoid this loss of flux near the
edges of the image, see @ref{Edges in the frequency domain} for an
interpretation from the frequency domain perspective.
+@cindex Small circle
+@cindex Circle (small)
+As you get more distant from the equator (declination becomes
+non-zero), any change along the RA (towards the east; 1 degree in the
+example above) will no longer be on a great circle, but along a
+``@url{https://en.wikipedia.org/wiki/Circle_of_a_sphere, small
+circle}''.
+On a small circle that is defined by the fixed declination
+@mymath{\delta}, the distance between two points is smaller than the
+distance between their projections on the equator (as described in
+the definition of longitude above).
+It is smaller by a factor of @mymath{\cos({\delta})}.
-Note that the edge effect discussed here is different from the one in @ref{If
convolving afterwards}.
-In making mock images we want to simulate a real observation.
-In a real observation, the images of the galaxies on the sides of the CCD are
first blurred by the atmosphere and instrument, then imaged.
-So light from the parts of a galaxy which are immediately outside the CCD will
affect the parts of the galaxy which are covered by the CCD.
-Therefore in modeling the observation, we have to convolve an image that is
larger than the input image by exactly half of the convolution kernel.
-We can hence conclude that this correction for the edges is only useful when
working on actual observed images (where we do not have any more data on the
edges) and not in modeling.
+Therefore, an angular change (let's call it @mymath{\Delta_{lon}}) along the
small circle defined by the fixed declination of @mymath{\delta} corresponds to
@mymath{\Delta_{lon}/\cos(\delta)} on the equator.
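+
+As a rough check with the numbers above (assuming Gnuastro's
+trigonometric operators, which take their input in degrees), half the
+box width (0.5 degrees) divided by @mymath{\cos(84)} reproduces the
+RA offset of the bottom vertices from the center
+(@mymath{24.78-20\approx4.78}):
+
+@example
+$ astarithmetic -q 0.5 84 cos /
+@end example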
+@end table
+@node Loading external columns, Size and position operators, Box shape
operators, Arithmetic operators
+@subsubsection Loading external columns
+In the Arithmetic program, you can always load new datasets by simply
+giving their names.
+However, they can only be images, not columns.
+In the Table program, you can load columns in @ref{Column
+arithmetic}, but they have to be columns within the same table (and
+thus with the same number of rows).
+However, in some situations, it is necessary to use certain columns of a table
in the Arithmetic program, or columns of different rows (from the main input)
in Table.
-@node Frequency domain and Fourier operations, Spatial vs. Frequency domain,
Spatial domain convolution, Convolve
-@subsection Frequency domain and Fourier operations
+@table @command
+@item load-col-%-from-%
+@itemx load-col-%-from-%-hdu-%
+Load the requested column (first @command{%}) from the requested file (second
@command{%}).
+If the file is a FITS file, it is also necessary to specify an HDU
+using the second form (where the HDU identifier is the third
+@command{%}).
+For example, @command{load-col-MAG-from-catalog.fits-hdu-1} will load the
@code{MAG} column from HDU 1 of @code{catalog.fits}.
-Getting a good grip on the frequency domain is usually not an easy job! So we
have decided to give the issue a complete review here.
-Convolution in the frequency domain (see @ref{Convolution theorem}) heavily
relies on the concepts of Fourier transform (@ref{Fourier transform}) and
Fourier series (@ref{Fourier series}) so we will be investigating these
important operations first.
-It has become something of a clich@'e for people to say that the Fourier
series ``is a way to represent a (wave-like) function as the sum of simple sine
waves'' (from Wikipedia).
-However, sines themselves are abstract functions, so this statement really
adds no extra layer of physical insight.
+For example, let's assume you have the following two tables, and you
+would like to add the first column of the first table to the (single)
+column of the second:
-Before jumping head-first into the equations and proofs, we will begin with a
historical background to see how the importance of frequencies actually roots
in our ancient desire to see everything in terms of circles.
-A short review of how the complex plane should be interpreted is then given.
-Having paved the way with these two basics, we define the Fourier series and
subsequently the Fourier transform.
-The final aim is to explain discrete Fourier transform, however some very
important concepts need to be solidified first: The Dirac comb, convolution
theorem and sampling theorem.
-So each of these topics are explained in their own separate sub-sub-section
before going on to the discrete Fourier transform.
-Finally we revisit (after @ref{Edges in the spatial domain}) the problem of
convolution on the edges, but this time in the frequency domain.
-Understanding the sampling theorem and the discrete Fourier transform is very
important in order to be able to pull out valuable science from the discrete
image pixels.
-Therefore we have included the mathematical proofs and figures so you can have
a clear understanding of these very important concepts.
+@example
+$ asttable tab-1.fits
+1 43.23
+2 21.91
+3 71.28
+4 18.10
-@menu
-* Fourier series historical background:: Historical background.
-* Circles and the complex plane:: Interpreting complex numbers.
-* Fourier series:: Fourier Series definition.
-* Fourier transform:: Fourier Transform definition.
-* Dirac delta and comb:: Dirac delta and Dirac comb.
-* Convolution theorem:: Derivation of Convolution theorem.
-* Sampling theorem:: Sampling theorem (Nyquist frequency).
-* Discrete Fourier transform:: Derivation and explanation of DFT.
-* Fourier operations in two dimensions:: Extend to 2D images.
-* Edges in the frequency domain:: Interpretation of edge effects.
-@end menu
+$ cat tab-2.txt
+5
+6
+7
+8
-@node Fourier series historical background, Circles and the complex plane,
Frequency domain and Fourier operations, Frequency domain and Fourier operations
-@subsubsection Fourier series historical background
-Ever since the ancient times, the circle has been (and still is) the simplest
shape for abstract comprehension.
-All you need is a center point and a radius and you are done.
-All the points on a circle are at a fixed distance from the center.
-However, the moment you try to connect this elegantly simple and beautiful
abstract construct (the circle) with the real world (for example, compute its
area or its circumference), things become really hard (ideally, impossible)
because the irrational number @mymath{\pi} gets involved.
+$ asttable tab-1.fits -c'arith $1 load-col-1-from-tab-2.txt +'
+6
+8
+10
+12
+@end example
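+
+If the second table were a FITS file, the HDU would also have to be
+given; for example, a sketch with a hypothetical @file{tab-2.fits}
+(whose first extension contains the same single column):
+
+@example
+$ asttable tab-1.fits \
+           -c'arith $1 load-col-1-from-tab-2.fits-hdu-1 +'
+@end example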
+@end table
-The key to understanding the Fourier series (thus the Fourier transform and
finally the Discrete Fourier Transform) is our ancient desire to express
everything in terms of circles or the most exceptionally simple and elegant
abstract human construct.
-Most people prefer to say the same thing in a more ahistorical manner: to
break a function into sines and cosines.
-As the term ``ancient'' in the previous sentence implies, Jean-Baptiste Joseph
Fourier (1768 -- 1830 A.D.) was not the first person to do this.
-The main reason we know this process by his name today is that he came up with
an ingenious method to find the necessary coefficients (radius of) and
frequencies (``speed'' of rotation on) the circles for any generic (integrable)
function.
+@node Size and position operators, Building new dataset and stack management,
Loading external columns, Arithmetic operators
+@subsubsection Size and position operators
-@float Figure,epicycle
+With the operators below you can get metadata about the top dataset on the
stack.
-@c Since these links are long, we had to write them like this so they do not
-@c jump out of the text width.
-@cindex Qutb al-Din al-Shirazi
-@cindex al-Shirazi, Qutb al-Din
-@image{gnuastro-figures/epicycles, 15.2cm, , Middle ages epicycles along with
two demonstrations of breaking a generic function using epicycles.}
-@caption{Epicycles and the Fourier series.
-Left: A demonstration of Mercury's epicycles relative to the ``center of the
world'' by Qutb al-Din al-Shirazi (1236 -- 1311 A.D.) retrieved
@url{https://commons.wikimedia.org/wiki/File:Ghotb2.jpg, from Wikipedia}.
-@url{https://commons.wikimedia.org/wiki/File:Fourier_series_square_wave_circles_animation.gif,
Middle} and
-Right: How adding more epicycles (or terms in the Fourier series) will
approximate functions.
-The
@url{https://commons.wikimedia.org/wiki/File:Fourier_series_sawtooth_wave_circles_animation.gif,
right} animation is also available.}
-@end float
+@table @code
+@item index
+Add a new operand to the stack with an integer type and the same size
+(in all dimensions) as the top operand on the stack (before this
+operator was called; it is not popped!).
+The first pixel in the returned operand is zero, and every later
+pixel's value is incremented by one.
+It is important to remember that the top operand is not popped by
+this operator, so it remains on the stack.
+To pop the previous operand, you can use the @code{indexonly}
+operator.
-Like most aspects of mathematics, this process of interpreting everything in
terms of circles, began for astronomical purposes.
-When astronomers noticed that the orbit of Mars and other outer planets, did
not appear to be a simple circle (as everything should have been in the
heavens).
-At some point during their orbit, the revolution of these planets would become
slower, stop, go back a little (in what is known as the retrograde motion) and
then continue going forward again.
+The data type of the output is always an unsigned integer, and its width is
determined from the number of pixels/rows in the top operand.
+For example, if there are only 108 rows in a table, the returned column will
have an unsigned 8-bit integer type (which can keep 256 separate values).
+But if the top operand is a @mymath{1000\times1000=10^6} pixel image, the
output will be a 32-bit unsigned integer.
+For the various types of integers, see @ref{Numeric data types}.
-The correction proposed by Ptolemy (90 -- 168 A.D.) was the most agreed upon.
-He put the planets on Epicycles or circles whose center itself rotates on a
circle whose center is the earth.
-Eventually, as observations became more and more precise, it was necessary to
add more and more epicycles in order to explain the complex motions of the
planets@footnote{See the Wikipedia page on ``Deferent and epicycle'' for a more
complete historical review.}.
-@ref{epicycle}(Left) shows an example depiction of the epicycles of Mercury in
the late 13th century.
+To see the index image along with the actual image, you can use the
@option{--writeall} option to have a multi-HDU output (without
@option{--writeall}, Arithmetic will complain if more than one operand is left
at the end).
+After DS9 opens with the second command below, flip between the two extensions.
-@cindex Aristarchus of Samos
-Of course we now know that if they had abdicated the Earth from its throne in
the center of the heavens and allowed the Sun to take its place, everything
would become much simpler and true.
-But there was not enough observational evidence for changing the
``professional consensus'' of the time to this radical view suggested by a
small minority@footnote{Aristarchus of Samos (310 -- 230 B.C.) appears to be
one of the first people to suggest the Sun being in the center of the universe.
-This approach to science (that the standard model is defined by consensus) and
the fact that this consensus might be completely wrong still applies equally
well to our models of particle physics and cosmology today.}.
-So the pre-Galilean astronomers chose to keep Earth in the center and find a
correction to the models (while keeping the heavens a purely ``circular''
order).
+@example
+$ astarithmetic image.fits index --writeall
+$ astscript-fits-view image_arith.fits
+@end example
-The main reason we are giving this historical background (which might appear
off topic) is to give historical evidence that while such ``approximations''
do work and are very useful for pragmatic reasons (like measuring the calendar
from the movement of astronomical bodies), they offer no physical insight.
-The astronomers who were involved with the Ptolemaic world view had to add a
huge number of epicycles during the centuries after Ptolemy in order to explain
more accurate observations.
-Finally the death knell of this world-view was Galileo's observations with his
new instrument (the telescope).
-So the physical insight, which is what Astronomers and Physicists are
interested in (as opposed to Mathematicians and Engineers who just like proving
and optimizing or calculating!) comes from being creative and not limiting
ourselves to such approximations.
-Even when they work.
+Below are some usage examples of this operator:
-@node Circles and the complex plane, Fourier series, Fourier series historical
background, Frequency domain and Fourier operations
-@subsubsection Circles and the complex plane
-Before going onto the derivation, it is also useful to review how the complex
numbers and their plane relate to the circles we talked about above.
-The two schematics in the middle and right of @ref{epicycle} show how a 1D
function of time can be made using the 2D real and imaginary surface.
-Seeing the animation in Wikipedia will really help in understanding this
important concept.
-At each point in time, we take the vertical coordinate of the point and use it
to find the value of the function at that point in time.
-@ref{iandtime} shows this relation with the axes marked.
+@table @asis
+@item Image: masking margins
+With the command shown after this list, we will mask all pixels that are
within 20 pixels of the edges of the image (on the margin).
+Here is a description of that command (for the basics of Arithmetic's
notation, see @ref{Reverse polish notation}):
+@itemize
+@item
+The @code{index} operator just adds a new dataset on the stack: unlike almost
all other operators in Arithmetic, @code{index} doesn't remove its input
dataset from the stack (use @code{indexonly} for the ``normal'' behavior).
+This is because @code{index} returns the pixel metadata, not the data.
+As a result, after @code{index}, we have two operands on the stack: the input
image and the index image.
+@item
+With the @code{set-i} operator, the top operand (the image containing the
index of each pixel) is popped from the stack and associated to the name
@code{i}.
+Therefore after this, the stack only has the input image.
+For more on the @code{set-} operator, see @ref{Operand storage in memory or a
file}.
+@item
+We need three values from the commands before Arithmetic (for the width and
height of the image and the size of the margin).
+To make the rest of the command easier to read/use, we'll define them in
Arithmetic as three named operators (respectively called @code{w}, @code{h} and
@code{m}).
+All three are integers that will have a positive value lower than
@mymath{2^{16}=65536} (for a ``normal'' image!).
+Therefore, we will store them as 16-bit unsigned integers with the
@code{uint16} operator (this will help optimal processing in later steps).
+For more on the type conversion operators, see @ref{Numerical type conversion
operators}.
+@item
+Using the modulo (@code{%}) and division (@code{/}) operators on the index
image and the width, we extract the horizontal (X) and vertical (Y) positions
of each pixel into separately named operands called @code{X} and @code{Y}.
+The maximum value in these two will also fit within an unsigned 16-bit
integer, so we'll also store these in that type.
+@item
+For the horizontal (X) dimension, we select pixels that are less than the
margin (@code{X m lt}) and those that are more than the width minus the
margin (@code{X w m - gt}).
+@item
+The output of the @code{lt} and @code{gt} conditional operators above is a
binary (0 or 1 valued) image.
+We therefore merge them into one binary image using the @code{or} operator.
+For more, see @ref{Conditional operators}.
+@item
+We repeat the two steps above for the vertical (Y) dimension.
+@item
+Once the images containing the to-be-masked pixels in each dimension are made,
we combine them into one binary image with a final @code{or} operator.
+At this point, the stack only has two operands: 1) the input image and 2) the
binary image that has a value of 1 for all pixels whose value should be changed.
+@item
+A single-element operand (@code{nan}) is added on the stack.
+@item
+Using the @code{where} operator, all the pixels of the third popped operand
(the image that was read from @file{image.fits}) that are non-zero in the
second popped operand (those on the margins) are replaced with the value of
the first popped operand (NaN).
+For more on the @code{where} operator, see @ref{Conditional operators}.
+@end itemize
-@cindex Roger Cotes
-@cindex Cotes, Roger
-@cindex Caspar Wessel
-@cindex Wassel, Caspar
-@cindex Leonhard Euler
-@cindex Euler, Leonhard
-@cindex Abraham de Moivre
-@cindex de Moivre, Abraham
-Leonhard Euler@footnote{Other forms of this equation were known before Euler.
-For example, in 1707 A.D. (the year of Euler's birth) Abraham de Moivre (1667
-- 1754 A.D.) showed that @mymath{(\cos{x}+i\sin{x})^n=\cos(nx)+i\sin(nx)}.
-In 1714 A.D., Roger Cotes (1682 -- 1716 A.D. a colleague of Newton who
proofread the second edition of Principia) showed that:
@mymath{ix=\ln(\cos{x}+i\sin{x})}.} (1707 -- 1783 A.D.) showed that the
complex exponential (@mymath{e^{iv}} where @mymath{v} is real) is periodic and
can be written as: @mymath{e^{iv}=\cos{v}+i\sin{v}}.
-Therefore @mymath{e^{i(v+2\pi)}=e^{iv}}.
-Later, Caspar Wessel (mathematician and cartographer 1745 -- 1818 A.D.)
showed how complex numbers can be displayed as vectors on a plane.
-Euler's identity might seem counter intuitive at first, so we will try to
explain it geometrically (for deeper physical insight).
-On the real-imaginary 2D plane (like the left hand plot in each box of
@ref{iandtime}), multiplying a number by @mymath{i} can be interpreted as
rotating the point by @mymath{90} degrees (for example, the value @mymath{3} on
the real axis becomes @mymath{3i} on the imaginary axis).
-On the other hand, @mymath{e\equiv\lim_{n\rightarrow\infty}(1+{1\over n})^n},
therefore, defining @mymath{m\equiv nu}, we get:
+@example
+$ margin=20
+$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
+$ height=$(astfits image.fits --keyvalue=NAXIS2 -q)
+$ astarithmetic image.fits index set-i \
+ $width uint16 set-w \
+ $height uint16 set-h \
+ $margin uint16 set-m \
+ i w % uint16 set-X \
+ i w / uint16 set-Y \
+ X m lt X w m - gt or \
+ Y m lt Y h m - gt or \
+ or nan where
+@end example
-@dispmath{e^{u}=\lim_{n\rightarrow\infty}\left(1+{1\over n}\right)^{nu}
- =\lim_{n\rightarrow\infty}\left(1+{u\over nu}\right)^{nu}
- =\lim_{m\rightarrow\infty}\left(1+{u\over m}\right)^{m}}
+@item Image: masking regions outside a circle
+As another example of usage on an image, in the command below we are using
@code{index} to define an image where each pixel contains its distance to the
point with X,Y coordinates of 345.2,250.3.
+We are then using that distance image to only keep the pixels that are within
a 50 pixel radius of that point.
-@noindent
-Taking @mymath{u\equiv iv} the result can be written as a generic complex
number (a function of @mymath{v}):
+The basic concept behind this process is very similar to the previous example,
with a different mathematical definition for the pixels to mask.
+The major difference is that since we want the distance to a point within the
image, we need negative values, and the center coordinates can be at sub-pixel
positions.
+The best numeric datatype for intermediate steps is therefore floating point.
+64-bit floating point can have a precision of up to 15 digits after the
decimal point.
+This is far too much for what we need here: in astronomical imaging, the PSF
is usually on the scale of 1 or more pixels (see @ref{Sampling theorem}).
+So even reaching a precision of one millionth of a pixel (offered by 32-bit
floating points) is beyond our wildest dreams (see @ref{Numeric data types}).
+We will also define the horizontal (X) and vertical (Y) operands after
shifting to the desired central point.
-@dispmath{e^{iv}=\lim_{m\rightarrow\infty}\left(1+i{v\over
- m}\right)^{m}=a(v)+ib(v)}
+@example
+$ radius=50
+$ centerx=345.2
+$ centery=250.3
+$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
+$ astarithmetic image.fits index set-i \
+ $width uint16 set-w \
+ $radius float32 set-r \
+ $centerx float32 set-cx \
+ $centery float32 set-cy \
+ i w % cx - set-X \
+ i w / cy - set-Y \
+ X X x Y Y x + sqrt r gt \
+ nan where --output=arith-masked.fits
+@end example
+@cartouche
@noindent
-For @mymath{v=\pi}, a nice geometric animation of going to the limit can be
seen @url{https://commons.wikimedia.org/wiki/File:ExpIPi.gif, on Wikipedia}.
-We see that @mymath{\lim_{m\rightarrow\infty}a(\pi)=-1}, while
@mymath{\lim_{m\rightarrow\infty}b(\pi)=0}, which gives the famous
@mymath{e^{i\pi}=-1} equation.
-The final value is the real number @mymath{-1}, however the distance of the
polygon points traversed as @mymath{m\rightarrow\infty} is half the
circumference of a circle or @mymath{\pi}, showing how @mymath{v} in the
equation above can be interpreted as an angle in units of radians and therefore
how @mymath{a(v)=\cos(v)} and @mymath{b(v)=\sin(v)}.
+@strong{Optimal data types have significant benefits:} choosing the minimum
required datatype for your operation is very important to avoid wasting your
CPU and RAM.
+Don't simply default to 64-bit floating points for everything!
+Integer operations are much faster than floating points, and within floating
point types, 32-bit is faster and will use half the RAM/storage!
+For more, see @ref{Numeric data types}.
+@end cartouche
-Since @mymath{e^{iv}} is periodic (let's assume with a period of @mymath{T}),
it is more clear to write it as @mymath{v\equiv{2{\pi}n\over T}t} (where
@mymath{n} is an integer), so @mymath{e^{iv}=e^{i{2{\pi}n\over T}t}}.
-The advantage of this notation is that the period (@mymath{T}) is clearly
visible and the frequency (@mymath{2{\pi}n \over T}, in units of 1/cycle) is
defined through the integer @mymath{n}.
-In this notation, @mymath{t} is in units of ``cycle''s.
+The example above was just a demonstration of the @code{index} operator and
some important concepts.
+But it is not the easiest way to achieve the desired result!
+An easier way for this scenario (keeping a circle within an image and setting
everything else to NaN) is to use MakeProfiles in combination with Arithmetic,
like below:
-As we see from the examples in @ref{epicycle} and @ref{iandtime}, for each
constituting frequency, we need a respective `magnitude' or the radius of the
circle in order to accurately approximate the desired 1D function.
-The concepts of ``period'' and ``frequency'' are relatively easy to grasp when
using temporal units like time because this is how we define them in every-day
life.
-However, in an image (astronomical data), we are dealing with spatial units
like distance.
-Therefore, by one ``period'' we mean the @emph{distance} at which the signal
is identical and frequency is defined as the inverse of that spatial ``period''.
-The complex circle of @ref{iandtime} can be thought of the Moon rotating about
Earth which is rotating around the Sun; so the ``Real (signal)'' axis shows the
Moon's position as seen by a distant observer on the Sun as time goes by.
-Because of the scalar (not having any direction or vector) nature of time,
@ref{iandtime} is easier to understand in units of time.
-When thinking about spatial units, mentally replace the ``Time (sec)'' axis
with ``Distance (meters)''.
-Because length has direction and is a vector, visualizing the rotation of the
imaginary circle and the advance along the ``Distance (meters)'' axis is not as
simple as temporal units like time.
+@example
+$ radius=50
+$ centerx=345.2
+$ centery=250.3
+$ echo "1 $centerx $centery 5 $radius 0 0 1 1 1" \
+ | astmkprof --background=image.fits \
+ --mforflatpix --clearcanvas \
+ -omkprof-mask.fits --type=uint8
+$ astarithmetic image.fits mkprof-mask.fits not \
+ nan where -g1 -omkprof-masked.fits
+@end example
-@float Figure,iandtime
-@image{gnuastro-figures/iandtime, 15.2cm, , }
-@caption{Relation between the real (signal), imaginary
(@mymath{i\equiv\sqrt{-1}}) and time axes at two snapshots of time.}
-@end float
+@item Tables: adding new columns with row index
+Within Table, you can use this operator to add an index column like below (see
the @code{counter} operator for starting the count from one).
+
+@example
+## The index will be the second column.
+$ asttable table.fits -c'arith $1 index'
+## The index will be the first column.
+$ asttable table.fits -c'arith $1 index swap'
+@end example
+@end table
+@item indexonly
+Similar to @code{index}, except that the top operand is popped from the stack
and is no longer available afterwards.
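+
+For example, with the minimal sketch below (assuming @file{image.fits} is any
2D image), only the index image remains on the stack, so it is the only
dataset in the output:
+
+@example
+$ astarithmetic image.fits indexonly --output=index.fits
+@end example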
+@item counter
+Similar to @code{index}, except that counting starts from one (not zero as in
@code{index}).
+Counting from one is usually necessary when adding row counters in tables,
like below:
-@node Fourier series, Fourier transform, Circles and the complex plane,
Frequency domain and Fourier operations
-@subsubsection Fourier series
-In astronomical images, our variable (brightness, or number of
photo-electrons, or signal to be more generic) is recorded over the 2D spatial
surface of a camera pixel.
-However to make things easier to understand, here we will assume that the
signal is recorded in 1D (assume one row of the 2D image pixels).
-Also for this section and the next (@ref{Fourier transform}) we will be
talking about the signal before it is digitized or pixelated.
-Let's assume that we have the continuous function @mymath{f(l)} which is
integrable in the interval @mymath{[l_0, l_0+L]} (always true in practical
cases like images).
-Take @mymath{l_0} as the position of the first pixel in the assumed row of the
image and @mymath{L} as the width of the image along that row.
-The units of @mymath{l_0} and @mymath{L} can be in any spatial units (for
example, meters) or an angular unit (like radians) multiplied by a fixed
distance which is more common.
+@example
+$ asttable table.fits -c'arith $1 counter swap'
+@end example
-To approximate @mymath{f(l)} over this interval, we need to find a set of
frequencies and their corresponding `magnitude's (see @ref{Circles and the
complex plane}).
-Therefore our aim is to show @mymath{f(l)} as the following sum of periodic
functions:
+@item counteronly
+Similar to @code{counter}, except that the top operand is popped from the
stack (and is no longer available afterwards).
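+
+For example, in the sketch below (assuming @file{table.fits} is an existing
table), the only output column is the counter itself, because the input column
(@code{$1}) is popped and freed:
+
+@example
+$ asttable table.fits -c'arith $1 counteronly'
+@end example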
+@item size
+Size of the dataset along a given FITS (or FORTRAN) dimension (counting from
1).
+The desired dimension should be the first popped operand and the dataset must
be the second popped operand.
+The output will be a single unsigned integer (dimensions cannot be negative).
+For example, the following command will produce the size of the first
extension/HDU (the default HDU) of @file{a.fits} along the second FITS axis.
-@dispmath{
-f(l)=\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}n\over L}l} }
+@example
+$ astarithmetic a.fits 2 size
+@end example
+@cartouche
@noindent
-Note that the different frequencies (@mymath{2{\pi}n/L}, in units of cycles
per meters for example) are not arbitrary.
-They are all integer multiples of the fundamental frequency of
@mymath{\omega_0=2\pi/L}.
-Recall that @mymath{L} was the length of the signal we want to model.
-Therefore, we see that the smallest possible frequency (or the frequency
resolution) in the end, depends on the length we observed the signal or
@mymath{L}.
-In the case of each dimension on an image, this is the size of the image in
the respective dimension.
-The frequencies have been defined in this ``harmonic'' fashion to ensure that
the final sum is periodic outside of the @mymath{[l_0, l_0+L]} interval too.
-At this point, you might be thinking that the sky is not periodic with the
same period as my camera's view angle.
-You are absolutely right! The important thing is that since your camera's
observed region is the only region we are ``observing'' and will be using, the
rest of the sky is irrelevant; so we can safely assume the sky is periodic
outside of it.
-However, this working assumption will haunt us later in @ref{Edges in the
frequency domain}.
+@strong{Not optimal:} This operator has to read the full dataset on the top
of the stack just to report its size along the given dimension.
+On a small dataset this won't consume much RAM, but if you want to put this
in a pipeline or use it on a large image, the extra RAM usage and slow
operation can become significant.
+To avoid such issues, you can read the size along the given dimension using
the @option{--keyvalue} option of @ref{Keyword inspection and manipulation}.
+For example, in the code below, the X axis position of every pixel is returned:
-The frequencies are thus determined by definition.
-So all we need to do is to find the coefficients (@mymath{c_n}), or
magnitudes, or radii of the circles for each frequency which is identified with
the integer @mymath{n}.
-Fourier's approach was to multiply both sides with a fixed term:
+@example
+$ width=$(astfits image.fits --keyvalue=NAXIS1 -q)
+$ astarithmetic image.fits indexonly $width % -opix-x.fits
+@end example
+@end cartouche
+@end table
-@dispmath{
-f(l)e^{-i{2{\pi}m\over
L}l}=\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}(n-m)\over L}l}
-}
-@noindent
-where @mymath{m>0}@footnote{ We could have assumed @mymath{m<0} and set the
exponential to positive, but this is more clear.}.
-We can then integrate both sides over the observation period:
+@node Building new dataset and stack management, Operand storage in memory or
a file, Size and position operators, Arithmetic operators
+@subsubsection Building new dataset and stack management
-@dispmath{
-\int_{l_0}^{l_0+L}f(l)e^{-i{2{\pi}m\over L}l}dl
-=\int_{l_0}^{l_0+L}\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}(n-m)\over
L}l}dl=\displaystyle\sum_{n=-\infty}^{\infty}c_n\int_{l_0}^{l_0+L}e^{i{2{\pi}(n-m)\over
L}l}dl
-}
+With the operators below, you can create a new dataset from scratch (to start
certain operations without any input data), or manage the operands that are
already on the stack.
-@noindent
-Both @mymath{n} and @mymath{m} are positive integers.
-Also, we know that a complex exponential is periodic so after one period
(@mymath{L}) it comes back to its starting point.
-Therefore @mymath{\int_{l_0}^{l_0+L}e^{i{2{\pi}k\over L}l}dl=0} for any
integer @mymath{k\neq0}.
-However, when @mymath{k=0}, this integral becomes:
@mymath{\int_{l_0}^{l_0+L}e^0dl=\int_{l_0}^{l_0+L}dl=L}.
-Hence, since the integral will be zero for all @mymath{n{\neq}m}, we get:
+@table @command
+@item makenew
+Create a new dataset that only has zero values.
+The number of dimensions is read as the first popped operand and the number
of elements along each dimension are the next popped operands (in reverse of
the popping order).
+The type of the new dataset is an unsigned 8-bit integer and all the element
values are zero.
+For example, if you want to create a new 100 by 200 pixel image, you can run
this command:
-@dispmath{
-\displaystyle\sum_{n=-\infty}^{\infty}c_n\int_{l_0}^{l_0+L}e^{i{2{\pi}(n-m)\over
L}l}dl=Lc_m }
+@example
+$ astarithmetic 100 200 2 makenew
+@end example
@noindent
-The origin of the axis is fundamentally an arbitrary position.
-So let's set it to the start of the image such that @mymath{l_0=0}.
-So we can find the ``magnitude'' of the frequency @mymath{2{\pi}m/L} within
@mymath{f(l)} through the relation:
-
-@dispmath{ c_m={1\over L}\int_{0}^{L}f(l)e^{-i{2{\pi}m\over L}l}dl }
-
+To further extend the example, you can use any of the noise-making operators
to add noise to this new dataset (see @ref{Random number generators}), like the
command below:
+@example
+$ astarithmetic 100 200 2 makenew 5 mknoise-sigma
+@end example
-@node Fourier transform, Dirac delta and comb, Fourier series, Frequency
domain and Fourier operations
-@subsubsection Fourier transform
-In @ref{Fourier series}, we had to assume that the function is periodic
outside of the desired interval with a period of @mymath{L}.
-Therefore, assuming that @mymath{L\rightarrow\infty} will allow us to work
with any function.
-However, with this approximation, the fundamental frequency
(@mymath{\omega_0}) or the frequency resolution that we discussed in
@ref{Fourier series} will tend to zero: @mymath{\omega_0\rightarrow0}.
-In the equation to find @mymath{c_m}, every @mymath{m} represented a frequency
(multiple of @mymath{\omega_0}) and the integration on @mymath{l} removes the
dependence of the right side of the equation on @mymath{l}, making it only a
function of @mymath{m} or frequency.
-Let's define the following two variables:
+@item constant
+Return an operand that will have a constant value (the first popped operand)
in all its elements.
+The number of elements is read from the second popped operand.
+The second popped operand is only used for its number of elements; its
numeric data type and its values are fully ignored, and it is freed
afterwards.
-@dispmath{\omega{\equiv}m\omega_0={2{\pi}m\over L}}
+@cindex Provenance
+Here is one useful scenario for this operator in tables: you want to merge
the objects/rows of some catalogs together, but you first want to give each
source catalog a label/counter that distinguishes the source of each row in
the merged/final catalog (using @ref{Invoking asttable}).
+The steps below show how to do this.
-@dispmath{F(\omega){\equiv}Lc_m}
+@example
+## Add label 1 to the RA, Dec, magnitude and magnitude error
+## rows of the first catalog.
+$ asttable cat-1.fits -cRA,DEC,MAG,MAG_ERR \
+ -c'arith $1 1 constant' --output=tab-1.fits
-@noindent
-The equation to find the coefficients of each frequency in
-@ref{Fourier series} thus becomes:
+## Similar to above, but for the second catalog.
+$ asttable cat-2.fits -cRA,DEC,MAG,MAG_ERR \
+ -c'arith $1 2 constant' --output=tab-2.fits
-@dispmath{ F(\omega)=\int_{-\infty}^{\infty}f(l)e^{-i{\omega}l}dl.}
+## Concatenate (merge/blend) the rows of the two tables into
+## one for the 5 columns, but also add a counter for each
+## object or row in the final catalog.
+$ asttable tab-1.fits --catrowfile=tab-2.fits \
+ -c'arith $1 counteronly' \
+ -cRA,DEC,MAG,MAG_ERR,5 --output=merged.fits \
+ --colmetadata=1,ID_MERGED,counter,"Merged ID." \
+ --colmetadata=6,SOURCE-CAT,counter,"Source ID."
-@noindent
-The function @mymath{F(\omega)} is thus the @emph{Fourier transform} of
@mymath{f(l)} in the frequency domain.
-So through this transformation, we can find (analyze) the magnitudes of the
constituting frequencies or the value in the frequency space@footnote{As we
discussed before, this `magnitude' can be interpreted as the radius of the
circle rotating at this frequency in the epicyclic interpretation of the
Fourier series, see @ref{epicycle} and @ref{iandtime}.} of our spatial input
function.
-The great thing is that we can also do the reverse and later synthesize the
input function from its Fourier transform.
-Let's do it: with the approximations above, multiply the right side of the
definition of the Fourier Series (@ref{Fourier series}) with
@mymath{1=L/L=({\omega_0}L)/(2\pi)}:
+## Add keyword information on each input. It is very important
+## to preserve this within the merged catalog. If the tables
+## came from public databases (for example on VizieR), give
+## their public identifier as the value.
+$ astfits merged.fits --write=/,"Source catalogs" \
+ --write=CATSRC1,"I/355/gaiadr3","VizieR ID." \
+ --write=CATSRC2,"Jane Doe","Name of source."
-@dispmath{ f(l)={1\over
-2\pi}\displaystyle\sum_{n=-\infty}^{\infty}Lc_ne^{{2{\pi}in\over
-L}l}\omega_0={1\over
-2\pi}\displaystyle\sum_{n=-\infty}^{\infty}F(\omega)e^{i{\omega}l}\Delta\omega
-}
+## Clean the temporary files and check the metadata
+## in 'merged.fits'.
+$ rm tab-1.fits tab-2.fits
+$ astfits merged.fits -h1
+@end example
+Like most operators, @code{constant} is not limited to tables; you can also
apply it to images.
+In the example below, we'll use @code{constant} to set all the pixels of the
input image to NaN (which is necessary in scenarios where you need to include
an image in an analysis, but do not want its pixels to affect the processing):
-@noindent
-To find the right most side of this equation, we renamed @mymath{\omega_0} as
@mymath{\Delta\omega} because it was our resolution, @mymath{2{\pi}n/L} was
written as @mymath{\omega} and finally, @mymath{Lc_n} was written as
@mymath{F(\omega)} as we defined above.
-Now, as @mymath{L\rightarrow\infty}, @mymath{\Delta\omega\rightarrow0} so we
can write:
+@example
+$ astarithmetic image.fits nan constant
+@end example
-@dispmath{ f(l)={1\over
- 2\pi}\int_{-\infty}^{\infty}F(\omega)e^{i{\omega}l}d\omega }
+@item swap
+Swap the top two operands on the stack.
+For example, the @code{index} operator doesn't pop its input (the top
operand); it just adds the index image on top of the stack.
+In case you want your next operation to be on the input to @code{index}, you
can simply call @code{swap} and continue the operations on that image, while
keeping the indexed pixels for later steps.
+In the example below we are using the @option{--writeall} option to write the
full stack; if you open the two outputs you will see that the stack order has
changed.
-Together, these two equations provide us with a very powerful set of tools
that we can use to process (analyze) and recreate (synthesize) the input signal.
-Through the first equation, we can break up our input function into its
constituent frequencies and analyze it, hence it is also known as
@emph{analysis}.
-Using the second equation, we can synthesize or make the input function from
the known frequencies and their magnitudes.
-Thus it is known as @emph{synthesis}.
-Here, we symbolize the Fourier transform (analysis) and its inverse
(synthesis) of a function @mymath{f(l)} and its Fourier Transform
@mymath{F(\omega)} as @mymath{{\cal F}[f]} and @mymath{{\cal F}^{-1}[F]}.
+@example
+## Index image is written in HDU 1.
+$ astarithmetic image.fits index --writeall \
+ --output=ind-first.fits
+## image.fits in HDU 1.
+$ astarithmetic image.fits index swap --writeall \
+ --output=img-first.fits
+@end example
+@end table
-@node Dirac delta and comb, Convolution theorem, Fourier transform, Frequency
domain and Fourier operations
-@subsubsection Dirac delta and comb
+@node Operand storage in memory or a file, , Building new dataset and stack
management, Arithmetic operators
+@subsubsection Operand storage in memory or a file
-The Dirac @mymath{\delta} (delta) function (also known as an impulse) is the
way that we convert a continuous function into a discrete one.
-It is defined to satisfy the following integral:
+In your early days of using Gnuastro, to do multiple operations, it is likely
that you will simply call Arithmetic (or Table, with column arithmetic)
multiple times: feeding the output file of the first call to the second call.
+But as you get more proficient in the reverse polish notation, you will find
yourself combining many operations into one call.
+This greatly speeds up the operation, because instead of writing the dataset
to a file in one command and reading it in the next, Arithmetic will just keep
the intermediate dataset in memory!
-@dispmath{\int_{-\infty}^{\infty}\delta(l)dl=1}
+But adding more complexity to your operations can make them much harder to
debug, or to extend even further.
+Therefore, in this section we have some special operators that behave
differently from the rest: they do not touch the contents of the data, only
where/how they are stored.
+They are designed to let you do complex operations without necessarily having
a complex command.
-@noindent
-When integrated with another function, it gives that function's value at
@mymath{l=0}:
-
-@dispmath{\int_{-\infty}^{\infty}f(l)\delta(l)dl=f(0)}
-
-@noindent
-An impulse positioned at another point (say @mymath{l_0}) is written as
@mymath{\delta(l-l_0)}:
+@table @command
+@item set-AAA
+Set the characters after the dash (@code{AAA} in the case shown here) as a
name for the first popped operand on the stack.
+The named dataset will be freed from memory as soon as it is no longer needed,
or if the name is reset to refer to another dataset later in the command.
+This operator thus enables reusability of a dataset without having to reread
it from a file every time it is necessary during a process.
+When a dataset is necessary more than once, this operator can thus help
simplify reading/writing on the command-line (thus avoiding potential bugs),
while also speeding up the processing.
-@dispmath{\int_{-\infty}^{\infty}f(l)\delta(l-l_0)dl=f(l_0)}
+Like all operators, this operator pops the top operand off the main
processing stack, but unlike other operators, it will not add anything back to
the stack immediately.
+It will keep the popped dataset in memory through a separate list of named
datasets (not on the main stack).
+That list will be used to add/copy the requested dataset back onto the main
processing stack any time its name is called.
-@noindent
-The Dirac @mymath{\delta} function also operates similarly if we use
summations instead of integrals.
-The Fourier transform of the delta function is:
+The name to give the popped dataset is part of the operator's name.
+For example, the @code{set-a} operator of the command below gives the name
``@code{a}'' to the contents of @file{image.fits}.
+This name is then used instead of the actual filename to multiply the dataset
by two.
-@dispmath{{\cal
F}[\delta(l)]=\int_{-\infty}^{\infty}\delta(l)e^{-i{\omega}l}dl=e^{-i{\omega}0}=1}
+@example
+$ astarithmetic image.fits set-a a 2 x
+@end example
-@dispmath{{\cal
F}[\delta(l-l_0)]=\int_{-\infty}^{\infty}\delta(l-l_0)e^{-i{\omega}l}dl=e^{-i{\omega}l_0}}
+The name can be any string, but avoid strings ending with standard filename
suffixes (for example, @file{.fits})@footnote{A dataset name like @file{a.fits}
(which can be set with @command{set-a.fits}) will cause confusion in the
initial parser of Arithmetic.
+It will assume this name is a FITS file, and if it is used multiple times,
Arithmetic will abort, complaining that you have not provided enough HDUs.}.
-@noindent
-From the definition of the Dirac @mymath{\delta} we can also define a
-Dirac comb (@mymath{{\rm III}_P}) or an impulse train with infinite
-impulses separated by @mymath{P}:
+One example of the usefulness of this operator is with the @code{where}
operator.
+For example, let's assume you want to mask all pixels larger than @code{5} in
@file{image.fits} (extension number 1) with a NaN value.
+Without setting a name for the dataset, you have to read the file into memory
two times, with a command like this:
-@dispmath{
-{\rm III}_P(l)\equiv\displaystyle\sum_{k=-\infty}^{\infty}\delta(l-kP) }
+@example
+$ astarithmetic image.fits image.fits 5 gt nan where -g1
+@end example
+But with this operator you can simply give @file{image.fits} the name @code{i}
and simplify the command above to the more readable one below (which greatly
helps when the filename is long):
-@noindent
-@mymath{P} is chosen to represent ``pixel width'' later in @ref{Sampling
theorem}.
-Therefore the Dirac comb is periodic with a period of @mymath{P}.
-We have intentionally used a different name for the period of the Dirac comb
compared to the input signal's length of observation that we showed with
@mymath{L} in @ref{Fourier series}.
-This difference is highlighted here to avoid confusion later when these two
periods are needed together in @ref{Discrete Fourier transform}.
-The Fourier transform of the Dirac comb will be necessary in @ref{Sampling
theorem}, so let's derive it.
-By its definition, it is periodic, with a period of @mymath{P}, so the Fourier
coefficients of its Fourier Series (@ref{Fourier series}) can be calculated
within one period:
+@example
+$ astarithmetic image.fits set-i i i 5 gt nan where
+@end example
-@dispmath{{\rm
III}_P=\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}n\over
-P}l}}
+@item repeat
+Add N copies of the second popped operand to the stack of operands.
+N is the first popped operand.
+For example, let's assume @file{image.fits} is a @mymath{100\times100} image.
+The output of the command below will be a 3D datacube of size
@mymath{100\times100\times20} voxels (volume-pixels):
-@noindent
-We can now find the @mymath{c_n} from @ref{Fourier series}:
+@example
+$ astarithmetic image.fits 20 repeat 20 add-dimension-slow
+@end example
-@dispmath{
-c_n={1\over P}\int_{-P/2}^{P/2}\delta(l)e^{-i{2{\pi}n\over P}l}dl
-={1\over P}\quad\quad \rightarrow \quad\quad
-{\rm III}_P={1\over P}\displaystyle\sum_{n=-\infty}^{\infty}e^{i{2{\pi}n\over
P}l}
-}
+@item tofile-AAA
+Write the top operand on the operands stack into a file called @code{AAA} (can
be any FITS file name) without changing the operands stack.
+If you do not need the dataset any more and would like to free it, see the
@code{tofilefree} operator below.
-@noindent
-So we can write the Fourier transform of the Dirac comb as:
+By default, any file that is given to this operator is deleted before
Arithmetic actually starts working on the input datasets.
+The deletion can be deactivated with the @option{--dontdelete} option (as in
all Gnuastro programs, see @ref{Input output options}).
+If the same FITS file is given to this operator multiple times, it will
contain multiple extensions (in the same order that the operator was called).
-@dispmath{
-{\cal F}[{\rm III}_P]=\int_{-\infty}^{\infty}{\rm III}_Pe^{-i{\omega}l}dl
-={1\over
P}\displaystyle\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-i(\omega-{2{\pi}n\over
P})l}dl={1\over
P}\displaystyle\sum_{n=-\infty}^{\infty}\delta\left(\omega-{2{\pi}n\over
P}\right)
-}
+For example, the operator @command{tofile-check.fits} will write the top
operand to @file{check.fits}.
+Since it does not modify the operands stack, this operator is very convenient
when you want to debug or understand a string of operators and operands given
to Arithmetic: simply put @command{tofile-AAA} anywhere in the process to see
what is happening behind the scenes without modifying the overall process.
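+
+For example, the sketch below (with hypothetical file names) doubles the
input image, saves that intermediate result into @file{doubled.fits} for
later inspection, and continues building a mask from it; the operands stack
is not changed by @command{tofile-doubled.fits}:
+
+@example
+$ astarithmetic image.fits 2 x tofile-doubled.fits \
+                5 gt --output=mask.fits
+@end example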
+@item tofilefree-AAA
+Similar to the @code{tofile} operator, with the only difference being that
the dataset that is written to a file is popped from the operand stack and
freed from memory (so it cannot be used any more).
-@noindent
-In the last step, we used the fact that the complex exponential is a periodic
function, that @mymath{n} is an integer and that as we defined in @ref{Fourier
transform}, @mymath{\omega{\equiv}m\omega_0}, where @mymath{m} was an integer.
-The integral will be zero for any @mymath{\omega} that is not equal to
@mymath{2{\pi}n/P}, a more complete explanation can be seen in @ref{Fourier
series}.
-Therefore, while in the spatial domain the impulses had spacing of @mymath{P}
(meters for example), in the frequency space, the spacing between the different
impulses are @mymath{2\pi/P} cycles per meters.
+@end table
-@node Convolution theorem, Sampling theorem, Dirac delta and comb, Frequency
domain and Fourier operations
-@subsubsection Convolution theorem
+@node Invoking astarithmetic, , Arithmetic operators, Arithmetic
+@subsection Invoking Arithmetic
-The convolution (shown with the @mymath{\ast} operator) of the two
-functions @mymath{f(l)} and @mymath{h(l)} is defined as:
+Arithmetic will do pixel-by-pixel arithmetic operations on the input datasets
and/or numbers.
+For the full list of operators with explanations, please see @ref{Arithmetic
operators}.
+Any operand that only has a single element (a number, or a single-pixel FITS
image) will be read as a number; the rest of the inputs must all have the same
dimensions.
+The general template is:
-@dispmath{
-c(l)\equiv[f{\ast}h](l)=\int_{-\infty}^{\infty}f(\tau)h(l-\tau)d\tau
-}
+@example
+$ astarithmetic [OPTION...] ASTRdata1 [ASTRdata2] OPERATOR ...
+@end example
@noindent
-See @ref{Convolution process} for a more detailed physical (pixel based)
interpretation of this definition.
-The Fourier transform of convolution (@mymath{C(\omega)}) can be written as:
+One line examples:
-@dispmath{
- C(\omega)=\int_{-\infty}^{\infty}[f{\ast}h](l)e^{-i{\omega}l}dl=
-
\int_{-\infty}^{\infty}f(\tau)\left[\int_{-\infty}^{\infty}h(l-\tau)e^{-i{\omega}l}dl\right]d\tau
-}
+@example
+## Calculate (10.32-3.84)^2.7 quietly (will just print 155.329):
+$ astarithmetic -q 10.32 3.84 - 2.7 pow
-@noindent
-To solve the inner integral, let's define @mymath{s{\equiv}l-\tau}, so
-that @mymath{ds=dl} and @mymath{l=s+\tau} then the inner integral
-becomes:
+## Invert the input image (1/pixel):
+$ astarithmetic 1 image.fits / --out=inverse.fits
-@dispmath{
-\int_{-\infty}^{\infty}h(l-\tau)e^{-i{\omega}l}dl=
-\int_{-\infty}^{\infty}h(s)e^{-i{\omega}(s+\tau)}ds=e^{-i{\omega}\tau}\int_{-\infty}^{\infty}h(s)e^{-i{\omega}s}ds=H(\omega)e^{-i{\omega}\tau}
-}
+## Multiply each pixel in image by -1:
+$ astarithmetic image.fits -1 x --out=negative.fits
-@noindent
-where @mymath{H(\omega)} is the Fourier transform of @mymath{h(l)}.
-Substituting this result for the inner integral above, we get:
+## Subtract extension 4 from extension 1 (counting from zero):
+$ astarithmetic image.fits image.fits - --out=skysub.fits \
+ --hdu=1 --hdu=4
-@dispmath{
-C(\omega)=H(\omega)\int_{-\infty}^{\infty}f(\tau)e^{-i{\omega}\tau}d\tau=H(\omega)F(\omega)=F(\omega)H(\omega)
-}
+## Add two images, then divide the sum by 2.0 (read as floating point):
+## Note that without the '.0', the '2' will be read/used as an integer.
+$ astarithmetic image1.fits image2.fits + 2.0 / --out=average.fits
-@noindent
-where @mymath{F(\omega)} is the Fourier transform of @mymath{f(l)}.
-So multiplying the Fourier transform of two functions individually, we get the
Fourier transform of their convolution.
-The convolution theorem also proves a relation between the convolutions in the
frequency space.
-Let's define:
+## Use Arithmetic's average operator:
+$ astarithmetic image1.fits image2.fits average --out=average.fits
-@dispmath{D(\omega){\equiv}F(\omega){\ast}H(\omega)}
+## Calculate the median of three images in three separate extensions:
+$ astarithmetic img1.fits img2.fits img3.fits median \
+ -h0 -h1 -h2 --out=median.fits
+@end example
-@noindent
-Applying the inverse Fourier Transform or synthesis equation (@ref{Fourier
transform}) to both sides and following the same steps above, we get:
+Arithmetic's notation for giving operands to operators is fully described in
@ref{Reverse polish notation}.
+The output dataset is the last remaining operand on the stack.
+When the output dataset is a single number, and @option{--output} is not
called, it will be printed on the standard output (command-line).
+When the output is an array, it will be stored as a file.
-@dispmath{d(l)=f(l)h(l)}
+The name of the final file can be specified with the @option{--output} option,
but if it is not given (and the output dataset has more than one element),
Arithmetic will use ``automatic output'' on the name of the first FITS image
encountered to generate an output file name, see @ref{Automatic output}.
+By default, if the output file already exists, it will be deleted before
Arithmetic starts operation.
+However, this can be disabled with the @option{--dontdelete} option (see
below).
+At any point during Arithmetic's operation, you can also write the top operand
on the stack to a file, using the @code{tofile} or @code{tofilefree} operators,
see @ref{Arithmetic operators}.
-@noindent
-Where @mymath{d(l)} is the inverse Fourier transform of @mymath{D(\omega)}.
-We can therefore re-write the two equations above formally as the convolution
theorem:
+By default, the world coordinate system (WCS) information of the output
dataset will be taken from the first input image (that contains a WCS) on the
command-line.
+This can be modified with the @option{--wcsfile} and @option{--wcshdu} options
described below.
+When the @option{--quiet} option is not given, the name and extension of the
dataset used for the output's WCS is printed on the command-line.
-@dispmath{
- {\cal F}[f{\ast}h]={\cal F}[f]{\cal F}[h]
-}
+Through operators like those starting with @code{collapse-}, the
dimensionality of the output may not be the same as the inputs.
+By default, when the output is 1D, Arithmetic will write it as a table, not an
image/array.
+The format of the output table (plain text, or FITS ASCII or binary) can be
set with the @option{--tableformat} option (see @ref{Input output options}).
+You can disable this feature (write 1D arrays as FITS images/arrays, or to the
standard output) with the @option{--onedasimage} or @option{--onedonstdout}
options.
-@dispmath{
- {\cal F}[fh]={\cal F}[f]\ast{\cal F}[h]
-}
+See @ref{Common options} for a review of the options in all Gnuastro programs.
+Arithmetic just redefines the @option{--hdu} and @option{--dontdelete} options
as explained below.
-Besides its usefulness in blurring an image by convolving it with a given
kernel, the convolution theorem also enables us to do another very useful
operation in data analysis: to match the blur (or PSF) between two images taken
with different telescopes/cameras or under different atmospheric conditions.
-This process is also known as de-convolution.
-Let's take @mymath{f(l)} as the image with a narrower PSF (less blurry) and
@mymath{c(l)} as the image with a wider PSF which appears more blurred.
-Also let's take @mymath{h(l)} to represent the kernel that should be convolved
with the sharper image to create the more blurry image.
-Above, we proved the relation between these three images through the
convolution theorem.
-But there, we assumed that @mymath{f(l)} and @mymath{h(l)} are known (given)
and the convolved image is desired.
+@table @option
-In de-convolution, we have @mymath{f(l)} --the sharper image-- and
@mymath{f*h(l)} --the more blurry image-- and we want to find the kernel
@mymath{h(l)}.
-The solution is a direct result of the convolution theorem:
+@item -h INT/STR
+@itemx --hdu INT/STR
+The header data unit of the input FITS images, see @ref{Input output options}.
+Unlike most options in Gnuastro (which ultimately keep only one value),
Arithmetic allows @option{--hdu} to be called multiple times and the value of
each invocation will be stored separately (for the unlimited number of input
images you would like to use).
+Recall that for other programs this (common) option only takes a single value.
+So in other programs, if you specify it multiple times on the command-line,
only the last value will be used and in the configuration files, it will be
ignored if it already has a value.
-@dispmath{
- {\cal F}[h]={{\cal F}[f{\ast}h]\over {\cal F}[f]}
- \quad\quad
- {\rm or}
- \quad\quad
- h(l)={\cal F}^{-1}\left[{{\cal F}[f{\ast}h]\over {\cal F}[f]}\right]
-}
+The values given to @option{--hdu} have to be in the same order as the input
FITS images.
+Options are first read from the command-line (from left to right), then
top-down in each configuration file, see @ref{Configuration file precedence}.
-While this works really nice, it has two problems:
+If the number of HDUs is less than the number of input images, Arithmetic will
abort and notify you.
+However, if there are more HDUs than FITS images, there is no problem: they
will be used in the given order (every time a FITS image comes up on the stack)
and the extra HDUs will be ignored in the end.
+So there is no problem with having extra HDUs in the configuration files and
by default several HDUs with a value of @option{0} are kept in the system-wide
configuration file when you install Gnuastro.
-@itemize
+@item -g INT/STR
+@itemx --globalhdu INT/STR
+Use the value given to this option as the HDU of all input FITS files.
+This option is very convenient when you have many input files and the dataset
of interest is in the same HDU of all the files.
+When this option is called, any values given to the @option{--hdu} option
(explained above) are ignored and will not be used.
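+
+For example, assuming the datasets of interest are in extension 1 of all
three inputs, the single @option{-g1} in the sketch below replaces three
separate calls to @option{--hdu}:
+
+@example
+$ astarithmetic img1.fits img2.fits img3.fits median -g1 \
+                --output=median.fits
+@end example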
-@item
-If @mymath{{\cal F}[f]} has any zero values, then the inverse Fourier
transform will not be a number!
+@item -w FITS
+@itemx --wcsfile FITS
+Name of the FITS file containing the WCS structure that must be written to
the output.
+The HDU/extension should be specified with @option{--wcshdu}.
-@item
-If there is significant noise in the image, then the high frequencies of the
noise are going to significantly reduce the quality of the final result.
+When this option is used, the respective WCS will be read before any
processing is done on the command-line and directly used in the final output.
+If the given file does not have any WCS, then the default WCS (first file on
the command-line with WCS) will be used in the output.
-@end itemize
+This option will mostly be used when the default file (first of the set of
inputs) is not the one containing your desired WCS.
+But with this option, you can also use Arithmetic to rewrite/change the WCS of
an existing FITS dataset from another file:
-A standard solution to both these problems is the Wiener de-convolution
-algorithm@footnote{@url{https://en.wikipedia.org/wiki/Wiener_deconvolution}}.
+@example
+$ astarithmetic data.fits --wcsfile=other.fits -ofinal.fits
+@end example
-@node Sampling theorem, Discrete Fourier transform, Convolution theorem,
Frequency domain and Fourier operations
-@subsubsection Sampling theorem
+@item -W STR
+@itemx --wcshdu STR
+HDU/extension to read the WCS within the file given to @option{--wcsfile}.
+For more, see the description of @option{--wcsfile}.
-Our mathematical functions are continuous, however, our data collecting and
measuring tools are discrete.
-Here we want to give a mathematical formulation for digitizing the continuous
mathematical functions so that later, we can retrieve the continuous function
from the digitized recorded input.
-Assuming that we have a continuous function @mymath{f(l)}, then we can define
@mymath{f_s(l)} as the `sampled' @mymath{f(l)} through the Dirac comb (see
@ref{Dirac delta and comb}):
+@item --envseed
+Use the environment for the random number generator settings in operators that
need them (for example, @code{mknoise-sigma}).
+This is very important for obtaining reproducible results, for more see
@ref{Generating random numbers}.
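+
+For example, the sketch below should produce the same noisy image every time
it is run, because the random number generator's settings are taken from the
environment (through @code{GSL_RNG_SEED}; see @ref{Generating random
numbers}):
+
+@example
+$ export GSL_RNG_SEED=1
+$ astarithmetic 100 100 2 makenew 5 mknoise-sigma \
+                --envseed --output=noised.fits
+@end example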
-@dispmath{
-f_s(l)=f(l){\rm III}_P=\displaystyle\sum_{n=-\infty}^{\infty}f(l)\delta(l-nP)
-}
+@item -n STR
+@itemx --metaname=STR
+Metadata (name) of the output dataset.
+For a FITS image or table, the string given to this option is written in the
@code{EXTNAME} or @code{TTYPE1} keyword (respectively).
-@noindent
-The discrete data-element @mymath{f_k} (for example, a pixel in an
-image), where @mymath{k} is an integer, can thus be represented as:
+If this keyword is present in a FITS extension, it will be printed in the
table output of a command like @command{astfits image.fits} (for images) or
@command{asttable table.fits -i} (for tables).
+This metadata can be very helpful for yourself in the future (when you have
forgotten the details), so it is recommended to use this option for files that
should be archived or shared with colleagues.
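+
+For example, with the sketch below (the file names are hypothetical), the
output image's @code{EXTNAME} keyword will contain the string
@code{SKY-SUB}:
+
+@example
+$ astarithmetic image.fits sky.fits - -g1 \
+                --metaname=SKY-SUB --output=sub.fits
+@end example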
-@dispmath{f_k=\int_{-\infty}^{\infty}f_s(l)dl=\int_{-\infty}^{\infty}f(l)\delta(l-kP)dl=f(kP)}
+@item -u STR
+@itemx --metaunit=STR
+Metadata (units) of the output dataset.
+For a FITS image or table, the string given to this option is written in the
@code{BUNIT} or @code{TUNIT1} keyword, respectively.
+In the case of tables, recall that the Arithmetic program only outputs a
single column; to work on more than one column, use column arithmetic in Table
(see @ref{Column arithmetic}).
+For more on the importance of metadata, see the description of
@option{--metaname}.
-Note that in practice, our discrete data points are not found in this fashion.
-Each detector pixel (in an image for example) has an area and averages the
signal it receives over that area, not a mathematical point as the Dirac
@mymath{\delta} function defines.
-However, as long as the variation in the signal over one detector pixel is not
significant, this can be a good approximation.
-Having put this issue to the side, we can now try to find the relation between
the Fourier transforms of the un-sampled @mymath{f(l)} and the sampled
@mymath{f_s(l)}.
-For a more clear notation, let's define:
+@item -c STR
+@itemx --metacomment=STR
+Metadata (comments) of the output dataset.
+For a FITS image or table, the string given to this option is written in the
@code{COMMENT} or @code{TCOMM1} keyword respectively.
+In the case of tables, recall that the Arithmetic program only outputs a
single column; to work on more than one column, use column arithmetic in Table
(see @ref{Column arithmetic}).
+For more on the importance of metadata, see the description of
@option{--metaname}.
-@dispmath{F_s(\omega)\equiv{\cal F}[f_s]}
+@item -O
+@itemx --onedasimage
+Write final dataset as a FITS image/array even if it has a single dimension.
+By default, if the output is 1D, it will be written as a table, see above.
+If the output has more than one dimension, this option is redundant.
-@dispmath{D(\omega)\equiv{\cal F}[{\rm III}_P]}
+@item -s
+@itemx --onedonstdout
+Write final dataset (only when it is 1D) to standard output, not as a file.
+By default 1D datasets will be written as a table, see above.
+If the output has more than one dimension, this option is redundant.
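+
+For example, in the sketch below (assuming @file{image.fits} is a 2D image),
the image is collapsed along its second dimension with the mean; the result is
1D, so with this option it is printed on the standard output instead of being
written as a table:
+
+@example
+$ astarithmetic image.fits 2 collapse-mean --onedonstdout -q
+@end example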
-@noindent
-Then using the Convolution theorem (see @ref{Convolution theorem}),
-@mymath{F_s(\omega)} can be written as:
+@item -D
+@itemx --dontdelete
+Do not delete the output file, or the files given to the @code{tofile} or
@code{tofilefree} operators, if they already exist.
+Instead, append the desired datasets to the extensions that already exist in
the respective file.
+Note that it does not matter if the final output file name is given with the
@option{--output} option, or determined automatically.
-@dispmath{F_s(\omega)={\cal F}[f(l){\rm III}_P]=F(\omega){\ast}D(\omega)}
+Arithmetic treats this option differently from its default operation in other
Gnuastro programs (see @ref{Input output options}).
+If the output file exists, when other Gnuastro programs are called with
@option{--dontdelete}, they simply complain and abort.
+But when Arithmetic is called with @option{--dontdelete}, it will append the
dataset(s) to the existing extension(s) of the file.
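+
+For example, after the two sketched commands below, @file{out.fits} will keep
both results as separate extensions (the second call appends instead of
deleting):
+
+@example
+$ astarithmetic image.fits 2 x --output=out.fits
+$ astarithmetic image.fits 3 x --output=out.fits --dontdelete
+@end example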
-@noindent
-Finally, from the definition of convolution and the Fourier transform
-of the Dirac comb (see @ref{Dirac delta and comb}), we get:
+@item -a
+@itemx --writeall
+Write all datasets on the stack as separate HDUs in the output file.
+This only affects datasets with multiple dimensions (or single-dimension
datasets when @option{--onedasimage} is called).
+This option is useful for debugging Arithmetic calls: to check all the images
on the stack while you are designing your operation.
+The top dataset on the stack will be on HDU number 1 of the output, the
second dataset will be on HDU number 2 and so on.
+@end table
-@dispmath{
-\eqalign{
-F_s(\omega) &= \int_{-\infty}^{\infty}F(\mu)D(\omega-\mu)d\mu \cr
-&= {1\over
P}\displaystyle\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}F(\mu)\delta\left(\omega-\mu-{2{\pi}n\over
P}\right)d\mu \cr
-&= {1\over P}\displaystyle\sum_{n=-\infty}^{\infty}F\left(
- \omega-{2{\pi}n\over P}\right).\cr }
-}
+Arithmetic accepts two kinds of input: images and numbers.
+Any input that is a file name of a recognized type (see @ref{Arguments}) and
has more than one element/pixel is considered to be an image.
+Numbers on the command-line will be read into the smallest type (see
@ref{Numeric data types}) that can store them, so @command{-2} will be read as
a @code{char} type (which is signed on most systems and can thus keep negative
values), @command{2500} will be read as an @code{unsigned short} (all positive
numbers will be read as unsigned), while @code{3.1415926535897} will be read as
a @code{double} and @code{3.14} will be read as a @code{float}.
+To force a number to be read as float, put a @code{.} after it (possibly
followed by a zero for easier readability), or add an @code{f} after it.
+Hence while @command{5} will be read as an integer, @command{5.},
@command{5.0} or @command{5f} will be added to the stack as a @code{float} (see
@ref{Reverse polish notation}).
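+
+For example, the sketch below shows the practical effect of this: in the
first command both operands are read as integers (so the division is integer
division), while in the second, the @code{.} forces floating point:
+
+@example
+## Integer division (should print 3):
+$ astarithmetic -q 7 2 /
+## Floating point division (should print 3.5):
+$ astarithmetic -q 7. 2 /
+@end example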
-@mymath{F(\omega)} was only a simple function, see @ref{samplingfreq}(left).
-However, from the sampled Fourier transform function we see that
@mymath{F_s(\omega)} is the superposition of infinite copies of
@mymath{F(\omega)} that have been shifted, see @ref{samplingfreq}(right).
-From the equation, it is clear that the shift in each copy is @mymath{2\pi/P}.
+Unless otherwise stated (in @ref{Arithmetic operators}), the operators can
deal with multiple numeric data types (see @ref{Numeric data types}).
+For example, in ``@command{a.fits b.fits +}'', the image types can be
@code{long} and @code{float}.
+In such cases, C's internal type conversion will be used.
+The output type will be set to the higher-ranking type of the two inputs.
+Unsigned integer types have a smaller rank than their signed counterparts,
and floating point types have a higher rank than the integer types.
+So the internal C type conversion done in the example above is equivalent to
this piece of C:
-@float Figure,samplingfreq
-@image{gnuastro-figures/samplingfreq, 15.2cm, , } @caption{Sampling causes
infinite repetition in the frequency domain.
-FT is an abbreviation for `Fourier transform'.
-@mymath{\omega_m} represents the maximum frequency present in the input.
-@mymath{F(\omega)} is only symmetric on both sides of 0 when the input is real
(not complex).
-In general @mymath{F(\omega)} is complex and thus cannot be simply plotted
like this.
-Here we have assumed a real Gaussian @mymath{f(t)} which has produced a
Gaussian @mymath{F(\omega)}.}
-@end float
+@example
+size_t i;
+long a[100];
+float b[100], out[100];
+for(i=0;i<100;++i) out[i]=a[i]+b[i];
+@end example
-The input @mymath{f(l)} can have any distribution of frequencies in it.
-In the example of @ref{samplingfreq}(left), the input consisted of a range of
frequencies equal to @mymath{\Delta\omega=2\omega_m}.
-Fortunately as @ref{samplingfreq}(right) shows, the assumed pixel size
(@mymath{P}) we used to sample this hypothetical function was such that
@mymath{2\pi/P>\Delta\omega}.
-The consequence is that each copy of @mymath{F(\omega)} has become completely
separate from the surrounding copies.
-Such a digitized (sampled) data set is thus called @emph{over-sampled}.
-When @mymath{2\pi/P=\Delta\omega}, @mymath{P} is just small enough to finely
separate even the largest frequencies in the input signal and thus it is known
as @emph{critically-sampled}.
-Finally if @mymath{2\pi/P<\Delta\omega} we are dealing with an
@emph{under-sampled} data set.
-In an under-sampled data set, the separate copies of @mymath{F(\omega)} are
going to overlap and this will deprive us of recovering high constituent
frequencies of @mymath{f(l)}.
-The effects of under-sampling in an image with high rates of change (for
example, a brick wall imaged from a distance) can clearly be visually seen and
is known as @emph{aliasing}.
+@noindent
+Relying on the default C type conversion significantly speeds up the
processing and also requires less RAM (when using very large images).
-When the input @mymath{f(l)} is composed of a finite range of frequencies,
@mymath{f(l)} is known as a @emph{band-limited} function.
-The example in @ref{samplingfreq}(left) was a nice demonstration of such a
case: for all @mymath{\omega<-\omega_m} or @mymath{\omega>\omega_m}, we have
@mymath{F(\omega)=0}.
-Therefore, when the input function is band-limited and our detector's pixels
are placed such that we have critically (or over-) sampled it, then we can
exactly reproduce the continuous @mymath{f(l)} from the discrete or digitized
samples.
-To do that, we just have to isolate one copy of @mymath{F(\omega)} from the
infinite copies and take its inverse Fourier transform.
+Some operators can only work on integer types (of any length, for example, bitwise operators) while others only work on floating point types (currently only the @code{pow} operator).
+In such cases, if the operand type(s) are different, an error will be printed.
+Arithmetic also comes with internal type conversion operators which you can
use to convert the data into the appropriate type, see @ref{Arithmetic
operators}.
-This ability to exactly reproduce the continuous input from the sampled or
digitized data leads us to the @emph{sampling theorem} which connects the
inherent property of the continuous signal (its maximum frequency) to that of
the detector (the spacing between its pixels).
-The sampling theorem states that the full (continuous) signal can be recovered
when the pixel size (@mymath{P}) and the maximum constituent frequency in the
signal (@mymath{\omega_m}) have the following relation@footnote{This equation
is also shown in some places without the @mymath{2\pi}.
-Whether @mymath{2\pi} is included or not depends on how you define the
frequency}:
+@cindex Options
+The hyphen (@command{-}) can be used both to specify options (see @ref{Options}) and to specify a negative number, which might be necessary in your arithmetic.
+To enable this, Arithmetic first scans all the input strings: if the first character after a hyphen is a digit, that hyphen is temporarily replaced by the vertical tab character (which is not commonly used).
+The arguments are then parsed without such strings being mistaken for options, after which every vertical tab is replaced back with a hyphen so the string can be read as a negative number.
+Therefore, as long as the names of the files you want to work on do not start with a vertical tab followed by a digit, there is no problem.
+An important consequence of this implementation is that you should not write negative fractions like @command{-.3}; write them as @command{-0.3} instead.
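+As a minimal C sketch of this idea (hypothetical code for illustration, not Gnuastro's actual implementation; the function name is made up):
+
+@example
+#include <ctype.h>
+
+/* If a command-line argument starts with a hyphen followed by a
+   digit, temporarily disguise the hyphen as a vertical tab so it
+   is not parsed as an option; it is restored after parsing.     */
+void
+protect_negative(char *arg)
+@{
+  if( arg[0]=='-' && isdigit(arg[1]) )
+    arg[0]='\v';
+@}
+@end example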
-@dispmath{{2\pi\over P}>2\omega_m}
+@cindex AWK
+@cindex GNU AWK
+Without any images, Arithmetic will act like a simple calculator and print the
resulting output number on the standard output like the first example above.
+If you really want such calculator operations on the command-line, AWK (GNU
AWK is the most common implementation) is much faster, easier and much more
powerful.
+For example, the numerical one-line example above can be done with the
following command.
+In general AWK is a fantastic tool and GNU AWK has a wonderful manual
(@url{https://www.gnu.org/software/gawk/manual/}).
+So if you often confront situations like this, or have to work with large text tables/catalogs, be sure to check out AWK and simplify your life.
-@noindent
-This relation was first formulated by Harry Nyquist (1889 -- 1976 A.D.) in
1928 and formally proved in 1949 by Claude E. Shannon (1916 -- 2001 A.D.) in
what is now known as the Nyquist-Shannon sampling theorem.
-In signal processing, the signal is produced (synthesized) by a transmitter
and is received and de-coded (analyzed) by a receiver.
-Therefore producing a band-limited signal is necessary.
+@example
+$ echo "" | awk '@{print (10.32-3.84)^2.7@}'
+155.329
+@end example
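+For comparison, the same calculation in Arithmetic's reverse polish notation would be the following (using the @code{pow} operator mentioned above); it should print the same value, roughly 155.3:
+
+@example
+$ astarithmetic 10.32 3.84 - 2.7 pow
+@end example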
-In astronomy, we do not produce the shapes of our targets, we are only
observers.
-Galaxies can have any shape and size, therefore ideally, our signal is not
band-limited.
-However, since we are always confined to observing through an aperture, the
aperture will cause a point source (for which @mymath{\omega_m=\infty}) to be
spread over several pixels.
-This spread is quantitatively known as the point spread function or PSF.
-This spread does blur the image which is undesirable; however, for this
analysis it produces the positive outcome that there will be a finite
@mymath{\omega_m}.
-Though we should caution that any detector will have noise which will add lots
of very high frequency (ideally infinite) changes between the pixels.
-However, the coefficients of those noise frequencies are usually exceedingly
small.
-@node Discrete Fourier transform, Fourier operations in two dimensions,
Sampling theorem, Frequency domain and Fourier operations
-@subsubsection Discrete Fourier transform
-As we have stated several times so far, the input image is a digitized,
pixelated or discrete array of values (@mymath{f_s(l)}, see @ref{Sampling
theorem}).
-The input is not a continuous function.
-Also, all our numerical calculations can only be done on a sampled, or
discrete Fourier transform.
-Note that @mymath{F_s(\omega)} is not discrete, it is continuous.
-One way would be to find the analytic @mymath{F_s(\omega)}, then sample it at
any desired ``freq-pixel''@footnote{We are using the made-up word
``freq-pixel'' so they are not confused with spatial domain ``pixels''.}
spacing.
-However, this process would involve two steps of operations and computers in
particular are not too good at analytic operations for the first step.
-So here, we will derive a method to directly find the `freq-pixel'ated
@mymath{F_s(\omega)} from the pixelated @mymath{f_s(l)}.
-Let's start with the definition of the Fourier transform (see @ref{Fourier
transform}):
-@dispmath{F_s(\omega)=\int_{-\infty}^{\infty}f_s(l)e^{-i{\omega}l}dl }
-@noindent
-From the definition of @mymath{f_s(l)} (using @mymath{x} instead of @mymath{n}) we get:
-@dispmath{
-\eqalign{
- F_s(\omega) &= \displaystyle\sum_{x=-\infty}^{\infty}
-
\int_{-\infty}^{\infty}f(l)\delta(l-xP)e^{-i{\omega}l}dl \cr
- &= \displaystyle\sum_{x=-\infty}^{\infty}
- f_xe^{-i{\omega}xP}
-}
-}
-@noindent
-Where @mymath{f_x} is the value of @mymath{f(l)} on the point @mymath{x} or
the value of the @mymath{x}th pixel.
-As shown in @ref{Sampling theorem} this function is infinitely periodic with a
period of @mymath{2\pi/P}.
-So all we need is the values within one period: @mymath{0<\omega<2\pi/P}, see
@ref{samplingfreq}.
-We want @mymath{X} samples within this interval, so the frequency difference
between each frequency sample or freq-pixel is @mymath{1/XP}.
-Hence we will evaluate the equation above on the points at:
-@dispmath{\omega={u\over XP} \quad\quad u = 0, 1, 2, ..., X-1}
-@noindent
-Therefore the value of the freq-pixel @mymath{u} in the frequency
-domain is:
-@dispmath{F_u=\displaystyle\sum_{x=0}^{X-1} f_xe^{-i{ux\over X}} }
-@noindent
-Therefore, we see that for each freq-pixel in the frequency domain, we are
going to need all the pixels in the spatial domain@footnote{So even if one
pixel is a blank pixel (see @ref{Blank pixels}), all the pixels in the
frequency domain will also be blank.}.
-If the input (spatial) pixel row is also @mymath{X} pixels wide, then we can
exactly recover the @mymath{x}th pixel with the following summation:
-@dispmath{f_x={1\over X}\displaystyle\sum_{u=0}^{X-1} F_ue^{i{ux\over X}} }
-When the input pixel row (we are still only working on 1D data) has @mymath{X}
pixels, then it is @mymath{L=XP} spatial units wide.
-@mymath{L}, or the length of the input data was defined in @ref{Fourier
series} and @mymath{P} or the space between the pixels in the input was defined
in @ref{Dirac delta and comb}.
-As we saw in @ref{Sampling theorem}, the input (spatial) pixel spacing
(@mymath{P}) specifies the range of frequencies that can be studied and in
@ref{Fourier series} we saw that the length of the (spatial) input,
(@mymath{L}) determines the resolution (or size of the freq-pixels) in our
discrete Fourier transformed image.
-Both result from the fact that the frequency domain is the inverse of the
spatial domain.
-@node Fourier operations in two dimensions, Edges in the frequency domain,
Discrete Fourier transform, Frequency domain and Fourier operations
-@subsubsection Fourier operations in two dimensions
-Once all the relations in the previous sections have been clearly understood
in one dimension, it is very easy to generalize them to two or even more
dimensions since each dimension is by definition independent.
-Previously we defined @mymath{l} as the continuous variable in 1D and the
inverse of the period in its direction to be @mymath{\omega}.
-Let's show the second spatial direction with @mymath{m} the inverse of the
period in the second dimension with @mymath{\nu}.
-The Fourier transform in 2D (see @ref{Fourier transform}) can be written as:
+@node Convolve, Warp, Arithmetic, Data manipulation
+@section Convolve
-@dispmath{F(\omega, \nu)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
-f(l, m)e^{-i({\omega}l+{\nu}m)}dl\,dm}
+@cindex Convolution
+@cindex Neighborhood
+@cindex Weighted average
+@cindex Average, weighted
+@cindex Kernel, convolution
+Convolution can be thought of as a process that blurs an image, reducing its contrast.
+If you are already familiar with the concept and just want to run Convolve,
you can jump to @ref{Convolution kernel} and @ref{Invoking astconvolve} and
skip the lengthy introduction on the basic definitions and concepts of
convolution.
-@dispmath{f(l, m)={1\over 4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
-F(\omega, \nu)e^{i({\omega}l+{\nu}m)}d\omega\,d\nu}
+There are generally two methods to convolve an image.
+The first and more intuitive one is in the ``spatial domain'' or using the
actual image pixel values, see @ref{Spatial domain convolution}.
+The second method is when we manipulate the ``frequency domain'', or work on
the magnitudes of the different frequencies that constitute the image, see
@ref{Frequency domain and Fourier operations}.
+Understanding convolution in the spatial domain is more intuitive and thus recommended if you are just starting to learn about convolution.
+Getting a good grasp of the frequency domain is a little more involved and needs some concentration and some mathematical proofs.
+However, its reward is a faster operation and, more importantly, a fundamental understanding of this very important operation.
-The 2D Dirac @mymath{\delta(l,m)} is non-zero only when @mymath{l=m=0}.
-The 2D Dirac comb (or Dirac brush! See @ref{Dirac delta and comb}) can be
written in units of the 2D Dirac @mymath{\delta}.
-For most image detectors, the sides of a pixel are equal in both dimensions.
-So @mymath{P} remains unchanged, if a specific device is used which has
non-square pixels, then for each dimension a different value should be used.
+@cindex Detection
+@cindex Atmosphere
+@cindex Blur image
+@cindex Cosmic rays
+@cindex Pixel mixing
+@cindex Mixing pixel values
+Convolution of an image will generally result in blurring the image because it
mixes pixel values.
+In other words, if the image has sharp differences in neighboring pixel
values@footnote{In astronomy, the only major time we confront such sharp
borders in signal are cosmic rays.
+All other sources of signal in an image are already blurred by the atmosphere
or the optics of the instrument.}, those sharp differences will become smoother.
+This has very useful consequences, for example in the detection of signal in noise.
+In an actual observed image, the variation in neighboring pixel values due to noise can be very high.
+But after convolution, those variations will decrease and we have a better chance of detecting the possible underlying signal.
+Another case where convolution is extensively used is in mock images and modeling in general: it can be used to simulate the effect of the atmosphere or the optical system on the mock profiles that we create, see @ref{PSF}.
+Convolution is a very interesting and important topic in any form of signal
analysis (including astronomical observations).
+So we have thoroughly@footnote{A mathematician will certainly consider this explanation incomplete and inaccurate.
+However, this text is written for an understanding of the operations that are done on a real (not complex), discrete and noisy astronomical image, not any general form of abstract function.} explained the concepts behind it in the following sub-sections.
-@dispmath{{\rm III}_P(l, m)\equiv\displaystyle\sum_{j=-\infty}^{\infty}
-\displaystyle\sum_{k=-\infty}^{\infty}
-\delta(l-jP, m-kP) }
+@menu
+* Spatial domain convolution:: Only using the input image values.
+* Frequency domain and Fourier operations:: Using frequencies in input.
+* Spatial vs. Frequency domain:: When to use which?
+* Convolution kernel:: How to specify the convolution kernel.
+* Invoking astconvolve:: Options and argument to Convolve.
+@end menu
-The Two dimensional Sampling theorem (see @ref{Sampling theorem}) is thus very
easily derived as before since the frequencies in each dimension are
independent.
-Let's take @mymath{\nu_m} as the maximum frequency along the second dimension.
-Therefore the two dimensional sampling theorem says that a 2D band-limited
function can be recovered when the following conditions hold@footnote{If the
pixels are not a square, then each dimension has to use the respective pixel
size, but since most detectors have square pixels, we assume so here too}:
+@node Spatial domain convolution, Frequency domain and Fourier operations,
Convolve, Convolve
+@subsection Spatial domain convolution
-@dispmath{ {2\pi\over P} > 2\omega_m \quad\quad\quad {\rm and}
-\quad\quad\quad {2\pi\over P} > 2\nu_m}
+The pixels in an input image represent different ``spatial'' positions; therefore, when convolution is done using only the actual input pixel values, the process is said to be in the ``spatial domain''.
+In particular this is in contrast to the ``frequency domain'' that we will
discuss later in @ref{Frequency domain and Fourier operations}.
+In the spatial domain (and in realistic situations where the image and the
convolution kernel do not extend to infinity), convolution is the process of
changing the value of one pixel to the @emph{weighted} average of all the
pixels in its @emph{neighborhood}.
-Finally, let's represent the pixel counter on the second dimension in the
spatial and frequency domains with @mymath{y} and @mymath{v} respectively.
-Also let's assume that the input image has @mymath{Y} pixels on the second
dimension.
-Then the two dimensional discrete Fourier transform and its inverse (see
@ref{Discrete Fourier transform}) can be written as:
+The `neighborhood' of each pixel (how many pixels in which direction) and the
`weight' function (how much each neighboring pixel should contribute depending
on its position) are given through a second image which is known as a
``kernel''@footnote{Also known as filter, here we will use `kernel'.}.
-@dispmath{F_{u,v}=\displaystyle\sum_{x=0}^{X-1}\displaystyle\sum_{y=0}^{Y-1}
-f_{x,y}e^{-i({ux\over X}+{vy\over Y})} }
-
-@dispmath{f_{x,y}={1\over
XY}\displaystyle\sum_{u=0}^{X-1}\displaystyle\sum_{v=0}^{Y-1}
-F_{u,v}e^{i({ux\over X}+{vy\over Y})} }
+@menu
+* Convolution process:: More basic explanations.
+* Edges in the spatial domain:: Dealing with the edges of an image.
+@end menu
+@node Convolution process, Edges in the spatial domain, Spatial domain
convolution, Spatial domain convolution
+@subsubsection Convolution process
-@node Edges in the frequency domain, , Fourier operations in two dimensions,
Frequency domain and Fourier operations
-@subsubsection Edges in the frequency domain
+In convolution, the kernel specifies the weights and positions of the neighbors of each pixel.
+To find the convolved value of a pixel, the central pixel of the kernel is
placed on that pixel.
+The values of each overlapping pixel in the kernel and image are multiplied by
each other and summed for all the kernel pixels.
+To have one pixel in the center, each side of the convolution kernel has to have an odd number of pixels.
+This process effectively mixes each pixel's value with those of its neighbors, resulting in a blurred image compared to the sharper input image.
-With a good grasp of the frequency domain, we can revisit the problem of
convolution on the image edges, see @ref{Edges in the spatial domain}.
-When we apply the convolution theorem (see @ref{Convolution theorem}) to
convolve an image, we first take the discrete Fourier transforms (DFT,
@ref{Discrete Fourier transform}) of both the input image and the kernel, then
we multiply them with each other and then take the inverse DFT to construct the
convolved image.
-Of course, in order to multiply them with each other in the frequency domain,
the two images have to be the same size, so let's assume that we pad the kernel
(it is usually smaller than the input image) with zero valued pixels in both
dimensions so it becomes the same size as the input image before the DFT.
+@cindex Linear spatial filtering
+Formally, convolution is one kind of linear `spatial filtering' in image
processing texts.
+If we assume that the kernel has @mymath{2a+1} and @mymath{2b+1} pixels on
each side, the convolved value of a pixel placed at @mymath{x} and @mymath{y}
(@mymath{C_{x,y}}) can be calculated from the neighboring pixel values in the
input image (@mymath{I}) and the kernel (@mymath{K}) from
-Having multiplied the two DFTs, we now apply the inverse DFT which is where
the problem is usually created.
-If the DFT of the kernel only had values of 1 (unrealistic condition!) then
there would be no problem and the inverse DFT of the multiplication would be
identical with the input.
-However in real situations, the kernel's DFT has a maximum of 1 (because the
sum of the kernel has to be one, see @ref{Convolution process}) and decreases
something like the hypothetical profile of @ref{samplingfreq}.
-So when multiplied with the input image's DFT, the coefficients or magnitudes
(see @ref{Circles and the complex plane}) of the smallest frequency (or the sum
of the input image pixels) remains unchanged, while the magnitudes of the
higher frequencies are significantly reduced.
+@dispmath{C_{x,y}=\sum_{s=-a}^{a}\sum_{t=-b}^{b}K_{s,t}\times{}I_{x+s,y+t}.}
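+As a minimal C sketch of this equation (an illustration only, not Convolve's actual implementation; the function name is made up, out-of-image pixels are taken to be zero as formalized below, and the kernel is assumed to be already flipped):
+
+@example
+/* Convolved value of pixel (x,y): image I has X by Y pixels and
+   kernel K has 2a+1 by 2b+1 pixels; both are row-major arrays.  */
+float
+convolve_pixel(float *I, long X, long Y, float *K,
+               long a, long b, long x, long y)
+@{
+  long s, t;
+  float sum=0.0f;
+  for(s=-a; s<=a; ++s)
+    for(t=-b; t<=b; ++t)
+      if(x+s>=0 && x+s<X && y+t>=0 && y+t<Y) /* Outside: zero. */
+        sum += K[(t+b)*(2*a+1)+(s+a)] * I[(y+t)*X+(x+s)];
+  return sum;
+@}
+@end example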
-As we saw in @ref{Sampling theorem}, the Fourier transform of a discrete input
will be infinitely repeated.
-In the final inverse DFT step, the input is in the frequency domain (the
multiplied DFT of the input image and the kernel DFT).
-So the result (our output convolved image) will be infinitely repeated in the
spatial domain.
-In order to accurately reconstruct the input image, we need all the
frequencies with the correct magnitudes.
-However, when the magnitudes of higher frequencies are decreased, longer
periods (shorter frequencies) will dominate in the reconstructed pixel values.
-Therefore, when constructing a pixel on the edge of the image, the newly
empowered longer periods will look beyond the input image edges and will find
the repeated input image there.
-So if you convolve an image in this fashion using the convolution theorem,
when a bright object exists on one edge of the image, its blurred wings will be
present on the other side of the convolved image.
-This is often termed as circular convolution or cyclic convolution.
+@cindex Correlation
+@cindex Convolution
+Formally, any pixel that is outside of the image in the equation above is considered to be zero (but see @ref{Edges in the spatial domain}).
+When the kernel is symmetric about its center, the blurred image has the same orientation as the original image.
+However, if the kernel is not symmetric, the image will be affected in the opposite manner; this is a natural consequence of the definition of spatial filtering.
+In order to avoid this, we can rotate the kernel about its center by 180 degrees, so the convolved output keeps the original orientation (this is done by default in the Convolve program).
+Technically speaking, the process is only known as @emph{convolution} when the kernel is flipped; if it is not, it is known as @emph{correlation}.
-So, as long as we are dealing with convolution in the frequency domain, there
is nothing we can do about the image edges.
-The least we can do is to eliminate the ghosts of the other side of the image.
-So, we add zero valued pixels to both the input image and the kernel in both
dimensions so the image that will be convolved has a size equal to the sum of
both images in each dimension.
-Of course, the effect of this zero-padding is that the sides of the output
convolved image will become dark.
-To put it another way, the edges are going to drain the flux from nearby
objects.
-But at least it is consistent across all the edges of the image and is
predictable.
-In Convolve, you can see the padded images when inspecting the frequency
domain convolution steps with the @option{--viewfreqsteps} option.
+To be a weighted average, the sum of the weights (the pixels in the kernel) has to be unity.
+As a consequence, the convolved and unconvolved images of an object will have the same brightness (see @ref{Brightness flux magnitude}), which is natural: convolution should not eat up the object's photons, it only disperses them.
+The convolution of each pixel is independent of the other pixels, and in some
cases, it may be necessary to convolve different parts of an image separately
(for example, when you have different amplifiers on the CCD).
+Therefore, to speed up spatial convolution, Gnuastro first defines a
tessellation over the input; assigning each group of pixels to ``tiles''.
+It then does the convolution in parallel on each tile.
+For more on how Gnuastro's programs create the tile grid (tessellation), see
@ref{Tessellation}.
-@node Spatial vs. Frequency domain, Convolution kernel, Frequency domain and
Fourier operations, Convolve
-@subsection Spatial vs. Frequency domain
-With the discussions above it might not be clear when to choose the spatial
domain and when to choose the frequency domain.
-Here we will try to list the benefits of each.
-@noindent
-The spatial domain,
-@itemize
-@item
-Can correct for the edge effects of convolution, see @ref{Edges in the spatial
domain}.
+@node Edges in the spatial domain, , Convolution process, Spatial domain
convolution
+@subsubsection Edges in the spatial domain
-@item
-Can operate on blank pixels.
+In purely `linear' spatial filtering (convolution), there are problems with
the edges of the input image.
+Here we will explain the problem in the spatial domain.
+For a discussion of this problem from the frequency domain perspective, see
@ref{Edges in the frequency domain}.
+The problem originates from the fact that on the edges, in practice, the sum
of the weights we use on the actual image pixels is not unity@footnote{Because
we assumed the overlapping pixels outside the input image have a value of
zero.}.
+For example, as discussed above, a profile in the center of an image will have the same brightness before and after convolution.
+However, for a partially imaged profile on the edge of the image, the brightness (sum of its pixel fluxes within the image, see @ref{Brightness flux magnitude}) will not be equal: some of the flux is going to be `eaten' by the edges.
-@item
-Can be faster than frequency domain when the kernel is small (in terms of the
number of pixels on the sides).
-@end itemize
+If you run @command{$ make check} on the source files of Gnuastro, you can see
this effect by comparing the @file{convolve_frequency.fits} with
@file{convolve_spatial.fits} in the @file{./tests/} directory.
+In the spatial domain, by default, no assumption will be made about pixels
outside of the image or any blank pixels in the image.
+The problem explained above will also occur on the sides of blank regions (see
@ref{Blank pixels}).
+The solution to this edge effect problem is only possible in the spatial
domain.
+For pixels near the edge, we have to abandon the assumption that the sum of the kernel pixels is unity during the convolution process@footnote{Of course, the sum of the kernel pixels still has to be unity in general.}.
+So taking @mymath{W} as the sum of the kernel pixels that overlapped with
non-blank and in-image pixels, the equation in @ref{Convolution process} will
become:
-@noindent
-The frequency domain,
-@itemize
-@item
-Will be much faster when the image and kernel are both large.
-@end itemize
+@dispmath{C_{x,y}= { \sum_{s=-a}^{a}\sum_{t=-b}^{b}K_{s,t}\times{}I_{x+s,y+t}
\over W}.}
@noindent
-As a general rule of thumb, when working on an image of modeled profiles use
the frequency domain and when working on an image of real (observed) objects
use the spatial domain (corrected for the edges).
-The reason is that if you apply a frequency domain convolution to a real image, you are going to lose information on the edges and generally you do not want large kernels.
-But when you have made the profiles in the image yourself, you can just make a
larger input image and crop the central parts to completely remove the edge
effect, see @ref{If convolving afterwards}.
-Also due to oversampling, both the kernels and the images can become very
large and the speed boost of frequency domain convolution will significantly
improve the processing time, see @ref{Oversampling}.
+In this manner, objects near the edges of the image or near blank pixels will also have the same brightness (within the image) before and after convolution.
+This correction is applied by default in Convolve when convolving in the
spatial domain.
+To disable it, you can use the @option{--noedgecorrection} option.
+In the frequency domain, there is no way to avoid this loss of flux near the
edges of the image, see @ref{Edges in the frequency domain} for an
interpretation from the frequency domain perspective.
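+As a hedged illustration of this correction (continuing the hypothetical @code{convolve_pixel} sketch from @ref{Convolution process}, not Convolve's actual source), the body of that function would become:
+
+@example
+/* Also accumulate the kernel weights (W) that overlapped the
+   image and divide by them, so edge pixels keep their flux.   */
+long s, t;
+float k, sum=0.0f, W=0.0f;
+for(s=-a; s<=a; ++s)
+  for(t=-b; t<=b; ++t)
+    if(x+s>=0 && x+s<X && y+t>=0 && y+t<Y)
+      @{
+        k=K[(t+b)*(2*a+1)+(s+a)];
+        W   += k;
+        sum += k * I[(y+t)*X+(x+s)];
+      @}
+return (W>0.0f) ? sum/W : 0.0f;
+@end example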
-@node Convolution kernel, Invoking astconvolve, Spatial vs. Frequency domain,
Convolve
-@subsection Convolution kernel
+Note that the edge effect discussed here is different from the one in @ref{If
convolving afterwards}.
+In making mock images we want to simulate a real observation.
+In a real observation, the images of the galaxies on the sides of the CCD are
first blurred by the atmosphere and instrument, then imaged.
+So light from the parts of a galaxy which are immediately outside the CCD will
affect the parts of the galaxy which are covered by the CCD.
+Therefore in modeling the observation, we have to convolve an image that is
larger than the input image by exactly half of the convolution kernel.
+We can hence conclude that this correction for the edges is only useful when
working on actual observed images (where we do not have any more data on the
edges) and not in modeling.
-All the programs that need convolution will need to be given a convolution
kernel file and extension.
-In most cases (other than Convolve, see @ref{Convolve}) the kernel file name
is optional.
-However, the extension is necessary and must be specified either on the
command-line or at least one of the configuration files (see @ref{Configuration
files}).
-Within Gnuastro, there are two ways to create a kernel image:
-@itemize
-@item
-MakeProfiles: You can use MakeProfiles to create a parametric (based on a
radial function) kernel, see @ref{MakeProfiles}.
-By default MakeProfiles will make the Gaussian and Moffat profiles in a
separate file so you can feed it into any of the programs.
+@node Frequency domain and Fourier operations, Spatial vs. Frequency domain,
Spatial domain convolution, Convolve
+@subsection Frequency domain and Fourier operations
-@item
-ConvertType: You can write your own desired kernel into a text file table and
convert it to a FITS file with ConvertType, see @ref{ConvertType}.
-Just be careful that the kernel has to have an odd number of pixels along its
two axes, see @ref{Convolution process}.
-All the programs that do convolution will normalize the kernel internally, so
if you choose this option, you do not have to worry about normalizing the
kernel.
-Only within Convolve, there is an option to disable normalization, see
@ref{Invoking astconvolve}.
+Getting a good grip on the frequency domain is usually not an easy job! So we
have decided to give the issue a complete review here.
+Convolution in the frequency domain (see @ref{Convolution theorem}) heavily
relies on the concepts of Fourier transform (@ref{Fourier transform}) and
Fourier series (@ref{Fourier series}) so we will be investigating these
important operations first.
+It has become something of a clich@'e for people to say that the Fourier
series ``is a way to represent a (wave-like) function as the sum of simple sine
waves'' (from Wikipedia).
+However, sines themselves are abstract functions, so this statement really
adds no extra layer of physical insight.
-@end itemize
+Before jumping head-first into the equations and proofs, we will begin with a
historical background to see how the importance of frequencies actually roots
in our ancient desire to see everything in terms of circles.
+A short review of how the complex plane should be interpreted is then given.
+Having paved the way with these two basics, we define the Fourier series and
subsequently the Fourier transform.
+The final aim is to explain the discrete Fourier transform; however, some very important concepts need to be solidified first: the Dirac comb, the convolution theorem and the sampling theorem.
+So each of these topics is explained in its own separate sub-sub-section before going on to the discrete Fourier transform.
+Finally we revisit (after @ref{Edges in the spatial domain}) the problem of
convolution on the edges, but this time in the frequency domain.
+Understanding the sampling theorem and the discrete Fourier transform is very
important in order to be able to pull out valuable science from the discrete
image pixels.
+Therefore we have included the mathematical proofs and figures so you can have
a clear understanding of these very important concepts.
-@noindent
-The two options to specify a kernel file name and its extension are shown
below.
-These are common between all the programs that will do convolution.
-@table @option
-@item -k FITS
-@itemx --kernel=FITS
-The convolution kernel file name.
-The @code{BITPIX} (data type) value of this file can be any standard type and
it does not necessarily have to be normalized.
-Several operations will be done on the kernel image prior to the program's
processing:
+@menu
+* Fourier series historical background:: Historical background.
+* Circles and the complex plane:: Interpreting complex numbers.
+* Fourier series:: Fourier Series definition.
+* Fourier transform:: Fourier Transform definition.
+* Dirac delta and comb:: Dirac delta and Dirac comb.
+* Convolution theorem:: Derivation of Convolution theorem.
+* Sampling theorem:: Sampling theorem (Nyquist frequency).
+* Discrete Fourier transform:: Derivation and explanation of DFT.
+* Fourier operations in two dimensions:: Extend to 2D images.
+* Edges in the frequency domain:: Interpretation of edge effects.
+@end menu
-@itemize
+@node Fourier series historical background, Circles and the complex plane,
Frequency domain and Fourier operations, Frequency domain and Fourier operations
+@subsubsection Fourier series historical background
+Ever since ancient times, the circle has been (and still is) the simplest shape for abstract comprehension.
+All you need is a center point and a radius and you are done.
+All the points on a circle are at a fixed distance from the center.
+However, the moment you try to connect this elegantly simple and beautiful
abstract construct (the circle) with the real world (for example, compute its
area or its circumference), things become really hard (ideally, impossible)
because the irrational number @mymath{\pi} gets involved.
-@item
-It will be converted to floating point type.
+The key to understanding the Fourier series (thus the Fourier transform and finally the discrete Fourier transform) is our ancient desire to express everything in terms of circles, the most exceptionally simple and elegant abstract human construct.
+Most people prefer to say the same thing in a more ahistorical manner: to
break a function into sines and cosines.
+As the term ``ancient'' in the previous sentence implies, Jean-Baptiste Joseph
Fourier (1768 -- 1830 A.D.) was not the first person to do this.
+The main reason we know this process by his name today is that he came up with
an ingenious method to find the necessary coefficients (radius of) and
frequencies (``speed'' of rotation on) the circles for any generic (integrable)
function.
-@item
-All blank pixels (see @ref{Blank pixels}) will be set to zero.
+@float Figure,epicycle
-@item
-It will be normalized so the sum of its pixels equal unity.
+@c Since these links are long, we had to write them like this so they do not
+@c jump out of the text width.
+@cindex Qutb al-Din al-Shirazi
+@cindex al-Shirazi, Qutb al-Din
+@image{gnuastro-figures/epicycles, 15.2cm, , Middle ages epicycles along with
two demonstrations of breaking a generic function using epicycles.}
+@caption{Epicycles and the Fourier series.
+Left: A demonstration of Mercury's epicycles relative to the ``center of the
world'' by Qutb al-Din al-Shirazi (1236 -- 1311 A.D.) retrieved
@url{https://commons.wikimedia.org/wiki/File:Ghotb2.jpg, from Wikipedia}.
+@url{https://commons.wikimedia.org/wiki/File:Fourier_series_square_wave_circles_animation.gif,
Middle} and
+Right: How adding more epicycles (or terms in the Fourier series) will
approximate functions.
+The
@url{https://commons.wikimedia.org/wiki/File:Fourier_series_sawtooth_wave_circles_animation.gif,
right} animation is also available.}
+@end float
-@item
-It will be flipped so the convolved image has the same orientation.
-This is only relevant if the kernel is not circular. See @ref{Convolution
process}.
-@end itemize
+Like most aspects of mathematics, this process of interpreting everything in terms of circles began for astronomical purposes: astronomers noticed that the orbits of Mars and the other outer planets did not appear to be simple circles (as everything should have been in the heavens).
+At some point during their orbit, the revolution of these planets would become slower, stop, go back a little (in what is known as the retrograde motion) and then continue going forward again.
-@item -U STR
-@itemx --khdu=STR
-The convolution kernel HDU.
-Although the kernel file name is optional, before running any of the programs,
they need to have a value for @option{--khdu} even if the default kernel is to
be used.
-So be sure to keep its value in at least one of the configuration files (see
@ref{Configuration files}).
-By default, the system configuration file has a value.
+The correction proposed by Ptolemy (90 -- 168 A.D.) was the most agreed upon.
+He put the planets on epicycles: circles whose centers themselves rotate on a circle whose center is the Earth.
+Eventually, as observations became more and more precise, it was necessary to
add more and more epicycles in order to explain the complex motions of the
planets@footnote{See the Wikipedia page on ``Deferent and epicycle'' for a more
complete historical review.}.
+@ref{epicycle}(Left) shows an example depiction of the epicycles of Mercury in
the late 13th century.
-@end table
+@cindex Aristarchus of Samos
+Of course, we now know that if they had dethroned the Earth from its throne in the center of the heavens and allowed the Sun to take its place, everything would become much simpler and closer to the truth.
+But there was not enough observational evidence for changing the
``professional consensus'' of the time to this radical view suggested by a
small minority@footnote{Aristarchus of Samos (310 -- 230 B.C.) appears to be
one of the first people to suggest the Sun being in the center of the universe.
+This approach to science (that the standard model is defined by consensus) and the fact that this consensus might be completely wrong still apply equally well to our models of particle physics and cosmology today.}.
+So the pre-Galilean astronomers chose to keep Earth in the center and find a
correction to the models (while keeping the heavens a purely ``circular''
order).
+The main reason we are giving this historical background (which might appear off-topic) is to show that while such ``approximations'' do work and are very useful for pragmatic purposes (like keeping the calendar from the movements of astronomical bodies), they offer no physical insight.
+The astronomers who were involved with the Ptolemaic world view had to add a
huge number of epicycles during the centuries after Ptolemy in order to explain
more accurate observations.
+Finally the death knell of this world-view was Galileo's observations with his
new instrument (the telescope).
+So the physical insight, which is what astronomers and physicists are interested in (as opposed to mathematicians and engineers, who just like proving, optimizing or calculating!), comes from being creative and not limiting ourselves to such approximations, even when they work.
-@node Invoking astconvolve, , Convolution kernel, Convolve
-@subsection Invoking Convolve
+@node Circles and the complex plane, Fourier series, Fourier series historical
background, Frequency domain and Fourier operations
+@subsubsection Circles and the complex plane
+Before going on to the derivation, it is also useful to review how complex numbers and their plane relate to the circles we talked about above.
+The two schematics in the middle and right of @ref{epicycle} show how a 1D
function of time can be made using the 2D real and imaginary surface.
+Seeing the animation in Wikipedia will really help in understanding this
important concept.
+At each point in time, we take the vertical coordinate of the point and use it
to find the value of the function at that point in time.
+@ref{iandtime} shows this relation with the axes marked.
-Convolve an input dataset (2D image or 1D spectrum for example) with a known
kernel, or make the kernel necessary to match two PSFs.
-The general template for Convolve is:
+@cindex Roger Cotes
+@cindex Cotes, Roger
+@cindex Caspar Wessel
+@cindex Wassel, Caspar
+@cindex Leonhard Euler
+@cindex Euler, Leonhard
+@cindex Abraham de Moivre
+@cindex de Moivre, Abraham
+Leonhard Euler@footnote{Other forms of this equation were known before Euler.
+For example, in 1707 A.D. (the year of Euler's birth) Abraham de Moivre (1667
-- 1754 A.D.) showed that @mymath{(\cos{x}+i\sin{x})^n=\cos(nx)+i\sin(nx)}.
+In 1714 A.D., Roger Cotes (1682 -- 1716 A.D., a colleague of Newton who proofread the second edition of Principia) showed that @mymath{ix=\ln(\cos{x}+i\sin{x})}.} (1707 -- 1783 A.D.) showed that the complex exponential (@mymath{e^{iv}} where @mymath{v} is real) is periodic and can be written as @mymath{e^{iv}=\cos{v}+i\sin{v}}.
+Therefore @mymath{e^{i(v+2\pi)}=e^{iv}}.
+Later, Caspar Wessel (mathematician and cartographer 1745 -- 1818 A.D.)
showed how complex numbers can be displayed as vectors on a plane.
+Euler's identity might seem counterintuitive at first, so we will try to explain it geometrically (for deeper physical insight).
+On the real-imaginary 2D plane (like the left hand plot in each box of
@ref{iandtime}), multiplying a number by @mymath{i} can be interpreted as
rotating the point by @mymath{90} degrees (for example, the value @mymath{3} on
the real axis becomes @mymath{3i} on the imaginary axis).
+On the other hand, @mymath{e\equiv\lim_{n\rightarrow\infty}(1+{1\over n})^n},
therefore, defining @mymath{m\equiv nu}, we get:
-@example
-$ astconvolve [OPTION...] ASTRdata
-@end example
+@dispmath{e^{u}=\lim_{n\rightarrow\infty}\left(1+{1\over n}\right)^{nu}
+ =\lim_{n\rightarrow\infty}\left(1+{u\over nu}\right)^{nu}
+ =\lim_{m\rightarrow\infty}\left(1+{u\over m}\right)^{m}}
@noindent
-One line examples:
+Taking @mymath{u\equiv iv} the result can be written as a generic complex
number (a function of @mymath{v}):
-@example
-## Convolve mockimg.fits with psf.fits:
-$ astconvolve --kernel=psf.fits mockimg.fits
+@dispmath{e^{iv}=\lim_{m\rightarrow\infty}\left(1+i{v\over
+ m}\right)^{m}=a(v)+ib(v)}
-## Convolve in the spatial domain:
-$ astconvolve observedimg.fits --kernel=psf.fits --domain=spatial
+@noindent
+For @mymath{v=\pi}, a nice geometric animation of going to the limit can be
seen @url{https://commons.wikimedia.org/wiki/File:ExpIPi.gif, on Wikipedia}.
+We see that @mymath{\lim_{m\rightarrow\infty}a(\pi)=-1}, while
@mymath{\lim_{m\rightarrow\infty}b(\pi)=0}, which gives the famous
@mymath{e^{i\pi}=-1} equation.
+The final value is the real number @mymath{-1}; however, the distance traversed along the polygon points as @mymath{m\rightarrow\infty} is half the circumference of a circle, or @mymath{\pi}, showing how @mymath{v} in the equation above can be interpreted as an angle in units of radians, and therefore how @mymath{a(v)=\cos(v)} and @mymath{b(v)=\sin(v)}.
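+You can also check this limit numerically; here is a small illustrative C99 program (not part of Gnuastro) using complex arithmetic:
+
+@example
+#include <stdio.h>
+#include <complex.h>
+
+int
+main(void)
+@{
+  long m;
+  double complex z;
+  double v=3.141592653589793;        /* v = pi */
+  for(m=10; m<=1000000; m*=10)
+    @{
+      z=cpow(1.0 + I*v/m, m);        /* (1 + iv/m)^m */
+      printf("m=%-8ld %.6f%+.6fi\n", m, creal(z), cimag(z));
+    @}
+  return 0;
+@}
+@end example
+
+@noindent
+As @mymath{m} grows, the printed values approach @mymath{-1+0i}, in agreement with @mymath{e^{i\pi}=-1}.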
-## Convolve a 3D cube (only spatial domain is supported in 3D).
-## It is also necessary to define 3D tiles and channels for
-## parallelization (see the Tessellation section for more).
-$ astconvolve cube.fits --kernel=kernel3d.fits --domain=spatial \
- --tilesize=30,30,30 --numchannels=1,1,1
-
-## Find the kernel to match sharper and blurry PSF images (they both
-## have to have the same pixel size).
-$ astconvolve --kernel=sharperimage.fits --makekernel=10 \
- blurryimage.fits
+Since @mymath{e^{iv}} is periodic (let's assume with a period of @mymath{T}), it is clearer to write it as @mymath{v\equiv{2{\pi}n\over T}t} (where @mymath{n} is an integer), so @mymath{e^{iv}=e^{i{2{\pi}n\over T}t}}.
+The advantage of this notation is that the period (@mymath{T}) is clearly
visible and the frequency (@mymath{2{\pi}n \over T}, in units of 1/cycle) is
defined through the integer @mymath{n}.
+In this notation, @mymath{t} is in units of ``cycle''s.
-## Convolve a Spectrum (column 14 in the FITS table below) with a
-## custom kernel (the kernel will be normalized internally, so only
-## the ratios are important). Sed is used to replace the spaces with
-## new line characters so Convolve sees them as values in one column.
-$ echo "1 3 10 3 1" | sed 's/ /\n/g' | astconvolve spectra.fits -c14
-@end example
+As we see from the examples in @ref{epicycle} and @ref{iandtime}, for each
constituting frequency, we need a respective `magnitude' or the radius of the
circle in order to accurately approximate the desired 1D function.
+The concepts of ``period'' and ``frequency'' are relatively easy to grasp when
using temporal units like time because this is how we define them in every-day
life.
+However, in an image (astronomical data), we are dealing with spatial units
like distance.
+Therefore, by one ``period'' we mean the @emph{distance} at which the signal
is identical and frequency is defined as the inverse of that spatial ``period''.
+The complex circle of @ref{iandtime} can be thought of as the Moon rotating about the Earth, which is rotating around the Sun; the ``Real (signal)'' axis then shows the Moon's position as seen by a distant observer on the Sun as time goes by.
+Because of the scalar (not having any direction or vector) nature of time,
@ref{iandtime} is easier to understand in units of time.
+When thinking about spatial units, mentally replace the ``Time (sec)'' axis
with ``Distance (meters)''.
+Because length has direction and is a vector, visualizing the rotation of the
imaginary circle and the advance along the ``Distance (meters)'' axis is not as
simple as temporal units like time.
-The only argument accepted by Convolve is an input image file.
-Some of the options are the same between Convolve and some other Gnuastro
programs.
-Therefore, to avoid repetition, they will not be repeated here.
-For the full list of options shared by all Gnuastro programs, please see
@ref{Common options}.
-In particular, in the spatial domain, on a multi-dimensional datasets,
convolve uses Gnuastro's tessellation to speed up the run, see
@ref{Tessellation}.
-Common options related to tessellation are described in @ref{Processing
options}.
+@float Figure,iandtime
+@image{gnuastro-figures/iandtime, 15.2cm, , }
+@caption{Relation between the real (signal), imaginary
(@mymath{i\equiv\sqrt{-1}}) and time axes at two snapshots of time.}
+@end float
-1-dimensional datasets (for example, spectra) are only read as columns within
a table (see @ref{Tables} for more on how Gnuastro programs read tables).
-Note that currently 1D convolution is only implemented in the spatial domain
and thus kernel-matching is also not supported.
-Here we will only explain the options particular to Convolve.
-Run Convolve with @option{--help} in order to see the full list of options
Convolve accepts, irrespective of where they are explained in this book.
-@table @option
-@item --kernelcolumn
-Column containing the 1D kernel.
-When the input dataset is a 1-dimensional column, and the host table has more
than one column, use this option to specify which column should be used.
+@node Fourier series, Fourier transform, Circles and the complex plane,
Frequency domain and Fourier operations
+@subsubsection Fourier series
+In astronomical images, our variable (brightness, or number of photo-electrons, or signal, to be more generic) is recorded over the 2D spatial surface of the camera's pixels.
+However to make things easier to understand, here we will assume that the
signal is recorded in 1D (assume one row of the 2D image pixels).
+Also for this section and the next (@ref{Fourier transform}) we will be
talking about the signal before it is digitized or pixelated.
+Let's assume that we have the continuous function @mymath{f(l)} which is
integrable in the interval @mymath{[l_0, l_0+L]} (always true in practical
cases like images).
+Take @mymath{l_0} as the position of the first pixel in the assumed row of the
image and @mymath{L} as the width of the image along that row.
+The units of @mymath{l_0} and @mymath{L} can be in any spatial units (for
example, meters) or an angular unit (like radians) multiplied by a fixed
distance which is more common.
-@item --nokernelflip
-Do not flip the kernel after reading; only for spatial domain convolution.
-This can be useful if the flipping has already been applied to the kernel.
-By default, the input kernel is flipped to avoid the output getting flipped;
see @ref{Convolution process}.
+To approximate @mymath{f(l)} over this interval, we need to find a set of
frequencies and their corresponding `magnitude's (see @ref{Circles and the
complex plane}).
+Therefore our aim is to show @mymath{f(l)} as the following sum of periodic
functions:
-@item --nokernelnorm
-Do not normalize the kernel after reading it, such that the sum of its pixels
is unity.
-As described in @ref{Convolution process}, the kernel is normalized by default.
-@item -d STR
-@itemx --domain=STR
-@cindex Discrete Fourier transform
-The domain to use for the convolution.
-The acceptable values are `@code{spatial}' and `@code{frequency}',
corresponding to the respective domain.
+@dispmath{
+f(l)=\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}n\over L}l} }
-For large images, the frequency domain process will be more efficient than
convolving in the spatial domain.
-However, the edges of the image will loose some flux (see @ref{Edges in the
spatial domain}) and the image must not contain any blank pixels, see
@ref{Spatial vs. Frequency domain}.
+@noindent
+Note that the different frequencies (@mymath{2{\pi}n/L}, in units of cycles per meter, for example) are not arbitrary.
+They are all integer multiples of the fundamental frequency @mymath{\omega_0=2\pi/L}.
+Recall that @mymath{L} was the length of the signal we want to model.
+Therefore, we see that the smallest possible frequency (or the frequency resolution) ultimately depends on the length over which we observed the signal, @mymath{L}.
+In the case of each dimension on an image, this is the size of the image in
the respective dimension.
+The frequencies have been defined in this ``harmonic'' fashion to ensure that the final sum is also periodic outside of the @mymath{[l_0, l_0+L]} interval.
+At this point, you might be thinking that the sky is not periodic with the same period as your camera's view angle.
+You are absolutely right! The important thing is that since your camera's
observed region is the only region we are ``observing'' and will be using, the
rest of the sky is irrelevant; so we can safely assume the sky is periodic
outside of it.
+However, this working assumption will haunt us later in @ref{Edges in the
frequency domain}.
+The frequencies are thus determined by definition.
+So all we need to do is to find the coefficients (@mymath{c_n}), or
magnitudes, or radii of the circles for each frequency which is identified with
the integer @mymath{n}.
+Fourier's approach was to multiply both sides with a fixed term:
-@item --checkfreqsteps
-With this option a file with the initial name of the output file will be
created that is suffixed with @file{_freqsteps.fits}, all the steps done to
arrive at the final convolved image are saved as extensions in this file.
-The extensions in order are:
+@dispmath{
+f(l)e^{-i{2{\pi}m\over
L}l}=\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}(n-m)\over L}l}
+}
-@enumerate
-@item
-The padded input image.
-In frequency domain convolution the two images (input and convolved) have to
be the same size and both should be padded by zeros.
+@noindent
+where @mymath{m>0}@footnote{We could have assumed @mymath{m<0} and set the exponential to positive, but this is clearer.}.
+We can then integrate both sides over the observation period:
-@item
-The padded kernel, similar to the above.
+@dispmath{
+\int_{l_0}^{l_0+L}f(l)e^{-i{2{\pi}m\over L}l}dl
+=\int_{l_0}^{l_0+L}\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}(n-m)\over
L}l}dl=\displaystyle\sum_{n=-\infty}^{\infty}c_n\int_{l_0}^{l_0+L}e^{i{2{\pi}(n-m)\over
L}l}dl
+}
-@item
-@cindex Phase angle
-@cindex Complex numbers
-@cindex Numbers, complex
-@cindex Fourier spectrum
-@cindex Spectrum, Fourier
-The Fourier spectrum of the forward Fourier transform of the input image.
-Note that the Fourier transform is a complex operation (and not viewable in one image!), so we either have to show the `Fourier spectrum' or the `Phase angle'.
-For the complex number @mymath{a+ib}, the Fourier spectrum is defined as
@mymath{\sqrt{a^2+b^2}} while the phase angle is defined as
@mymath{\arctan(b/a)}.
+@noindent
+Both @mymath{n} and @mymath{m} are integers.
+Also, we know that a complex exponential is periodic, so after one period (@mymath{L}) it comes back to its starting point.
+Therefore @mymath{\int_{l_0}^{l_0+L}e^{i{2{\pi}k\over L}l}dl=0} for any @mymath{k\neq0}.
+However, when @mymath{k=0}, this integral becomes: @mymath{\int_{l_0}^{l_0+L}e^0dl=\int_{l_0}^{l_0+L}dl=L}.
+Hence, since the integral will be zero for all @mymath{n{\neq}m}, we get:
-@item
-The Fourier spectrum of the forward Fourier transform of the kernel image.
+@dispmath{
+\displaystyle\sum_{n=-\infty}^{\infty}c_n\int_{l_0}^{l_0+L}e^{i{2{\pi}(n-m)\over L}l}dl=Lc_m }
-@item
-The Fourier spectrum of the multiplied (through complex arithmetic)
transformed images.
+@noindent
+The origin of the axis is fundamentally an arbitrary position, so let's set it to the start of the image such that @mymath{l_0=0}.
+We can then find the ``magnitude'' of the frequency @mymath{2{\pi}m/L} within @mymath{f(l)} through the relation:
-@item
-@cindex Round-off error
-@cindex Floating point round-off error
-@cindex Error, floating point round-off
-The inverse Fourier transform of the multiplied image.
-If you open it, you will see that the convolved image is now in the center,
not on one side of the image as it started with (in the padded image of the
first extension).
-If you are working on a mock image which originally had pixels of precisely 0.0, you will notice that in those parts that your convolved profile(s) did not cover, the values are now @mymath{\sim10^{-18}}; this is due to floating-point round-off errors.
-Therefore in the final step (when cropping the central parts of the image), we
also remove any pixel with a value less than @mymath{10^{-17}}.
-@end enumerate
+@dispmath{ c_m={1\over L}\int_{0}^{L}f(l)e^{-i{2{\pi}m\over L}l}dl }
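+As a numerical check of this relation (an illustrative C99 sketch, not part of Gnuastro), take the simple input @mymath{f(l)=\cos(2{\pi}l/L)}, for which @mymath{c_1=c_{-1}=1/2} and all other coefficients vanish:
+
+@example
+#include <stdio.h>
+#include <math.h>
+#include <complex.h>
+
+int
+main(void)
+@{
+  int m, i, N=100000;
+  double pi=3.141592653589793;
+  double L=1.0, l, dl=L/N;
+  double complex cm;
+  for(m=0; m<=2; ++m)
+    @{
+      /* Midpoint-rule integration of (1/L) f(l) e^(-i2pi.m.l/L). */
+      cm=0.0;
+      for(i=0; i<N; ++i)
+        @{
+          l=(i+0.5)*dl;
+          cm += cos(2*pi*l/L) * cexp(-I*2*pi*m*l/L) * dl / L;
+        @}
+      printf("c_%d = %.4f%+.4fi\n", m, creal(cm), cimag(cm));
+    @}
+  return 0;
+@}
+@end example
+
+@noindent
+The output shows @mymath{c_0\approx0}, @mymath{c_1\approx0.5} and @mymath{c_2\approx0}, as expected.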
-@item --noedgecorrection
-Do not correct the edge effect in spatial domain convolution (this correction
is done in spatial domain convolution by default).
-For a full discussion, please see @ref{Edges in the spatial domain}.
-@item -m INT
-@itemx --makekernel=INT
-If this option is called, Convolve will do PSF-matching: the output will be
the kernel that you should convolve with the sharper image to obtain the blurry
one (see @ref{Convolution theorem}).
-The two images must have the same size (number of pixels).
-This option is not yet supported in 1-dimensional datasets.
-In effect, it is only necessary to give the two PSFs of your two datasets,
find the matching kernel based on that, then apply that kernel to the
higher-resolution (sharper image).
-The image given to the @option{--kernel} option is assumed to be the sharper
(less blurry) image and the input image (with no option) is assumed to be the
more blurry image.
-The value given to this option will be used as the maximum radius of the
kernel.
-Any pixel in the final kernel that is larger than this distance from the
center will be set to zero.
+@node Fourier transform, Dirac delta and comb, Fourier series, Frequency
domain and Fourier operations
+@subsubsection Fourier transform
+In @ref{Fourier series}, we had to assume that the function is periodic
outside of the desired interval with a period of @mymath{L}.
+Therefore, assuming that @mymath{L\rightarrow\infty} will allow us to work
with any function.
+However, with this approximation, the fundamental frequency
(@mymath{\omega_0}) or the frequency resolution that we discussed in
@ref{Fourier series} will tend to zero: @mymath{\omega_0\rightarrow0}.
+In the equation to find @mymath{c_m}, every @mymath{m} represented a frequency
(multiple of @mymath{\omega_0}) and the integration on @mymath{l} removes the
dependence of the right side of the equation on @mymath{l}, making it only a
function of @mymath{m} or frequency.
+Let's define the following two variables:
-Noise has large frequencies which can make the result less reliable for the
higher frequencies of the final result.
-So all the frequencies which have a spectrum smaller than the value given to
the @option{minsharpspec} option in the sharper input image are set to zero and
not divided.
-This will cause the wings of the final kernel to be flatter than they would
ideally be which will make the convolved image result unreliable if it is too
high.
+@dispmath{\omega{\equiv}m\omega_0={2{\pi}m\over L}}
-Some notes on how to prepare your two input PSFs.
-Note that these (and several other issues that relate to an accurate
estimation of the PSF) are practically described in the following tutorial:
@ref{Building the extended PSF}.
+@dispmath{F(\omega){\equiv}Lc_m}
-@itemize
-@item
-Choose a bright (unsaturated) star and use a region box (with Crop for
example, see @ref{Crop}) that is sufficiently above the noise.
+@noindent
+The equation to find the coefficients of each frequency in
+@ref{Fourier series} thus becomes:
-@item
-Mask all background sources that may be nearby (you can use Segment's clumps,
see @ref{Segment}).
+@dispmath{ F(\omega)=\int_{-\infty}^{\infty}f(l)e^{-i{\omega}l}dl.}
-@item
-Use Warp (see @ref{Warp}) to warp the pixel grid so the star's center is
exactly on the center of the central pixel in the cropped image.
-This will certainly degrade the result slightly; however, it is necessary.
-If there are multiple good stars, you can shift all of them, then normalize
them (so the sum of each star's pixels is one) and then take their average to
decrease this effect.
+@noindent
+The function @mymath{F(\omega)} is thus the @emph{Fourier transform} of
@mymath{f(l)} in the frequency domain.
+So through this transformation, we can find (analyze) the magnitudes of the
constituting frequencies or the value in the frequency space@footnote{As we
discussed before, this `magnitude' can be interpreted as the radius of the
circle rotating at this frequency in the epicyclic interpretation of the
Fourier series, see @ref{epicycle} and @ref{iandtime}.} of our spatial input
function.
+The great thing is that we can also do the reverse and later synthesize the
input function from its Fourier transform.
+Let's do it: with the approximations above, multiply the right side of the
definition of the Fourier Series (@ref{Fourier series}) with
@mymath{1=L/L=({\omega_0}L)/(2\pi)}:
-@item
-The shifting might move the center of the star by one pixel in any direction,
so crop the central pixel of the warped image to have a clean image for the
de-convolution.
-@end itemize
+@dispmath{ f(l)={1\over
+2\pi}\displaystyle\sum_{n=-\infty}^{\infty}Lc_ne^{{2{\pi}in\over
+L}l}\omega_0={1\over
+2\pi}\displaystyle\sum_{n=-\infty}^{\infty}F(\omega)e^{i{\omega}l}\Delta\omega
+}
-@item -c
-@itemx --minsharpspec
-(@option{=FLT}) The minimum frequency spectrum (or coefficient, or pixel value
in the frequency domain image) to use in deconvolution, see the explanations
under the @option{--makekernel} option for more information.
-@end table
+@noindent
+To find the rightmost side of this equation, we renamed @mymath{\omega_0} as
@mymath{\Delta\omega} because it was our resolution, @mymath{2{\pi}n/L} was
written as @mymath{\omega} and finally, @mymath{Lc_n} was written as
@mymath{F(\omega)} as we defined above.
+Now, as @mymath{L\rightarrow\infty}, @mymath{\Delta\omega\rightarrow0} so we
can write:
+@dispmath{ f(l)={1\over
+ 2\pi}\int_{-\infty}^{\infty}F(\omega)e^{i{\omega}l}d\omega }
+Together, these two equations provide us with a very powerful set of tools
that we can use to process (analyze) and recreate (synthesize) the input signal.
+Through the first equation, we can break up our input function into its
constituent frequencies and analyze it; hence that equation is known as
@emph{analysis}.
+Using the second equation, we can synthesize or make the input function from
the known frequencies and their magnitudes; thus it is known as
@emph{synthesis}.
+Here, we symbolize the Fourier transform (analysis) of a function
@mymath{f(l)} as @mymath{{\cal F}[f]}, and its inverse (synthesis), acting on
the Fourier transform @mymath{F(\omega)}, as @mymath{{\cal F}^{-1}[F]}.
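+
+As a minimal numerical sketch of this analysis/synthesis pair (in Python with
NumPy, which is outside of Gnuastro; its discrete FFT is the computational
analogue of these integrals), we can check that synthesis exactly recovers a
signal from its analysis:
+
+@example
+import numpy as np
+
+l = np.linspace(0, 10, 256, endpoint=False)  # spatial samples
+f = np.exp(-(l - 5)**2)                      # a real Gaussian f(l)
+
+F = np.fft.fft(f)        # analysis:  f(l)     -> F(omega)
+f2 = np.fft.ifft(F)      # synthesis: F(omega) -> f(l)
+
+print(np.allclose(f, f2.real))               # True: exact round trip
+@end example
+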
+@node Dirac delta and comb, Convolution theorem, Fourier transform, Frequency
domain and Fourier operations
+@subsubsection Dirac delta and comb
+The Dirac @mymath{\delta} (delta) function (also known as an impulse) is the
way that we convert a continuous function into a discrete one.
+It is defined to satisfy the following integral:
+@dispmath{\int_{-\infty}^{\infty}\delta(l)dl=1}
+@noindent
+When integrated with another function, it gives that function's value at
@mymath{l=0}:
+@dispmath{\int_{-\infty}^{\infty}f(l)\delta(l)dl=f(0)}
+@noindent
+An impulse positioned at another point (say @mymath{l_0}) is written as
@mymath{\delta(l-l_0)}:
+@dispmath{\int_{-\infty}^{\infty}f(l)\delta(l-l_0)dl=f(l_0)}
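+
+To get a rough numerical feel for this ``sifting'' property, here is a small
sketch in Python with NumPy (the narrow, unit-area Gaussian below is only a
stand-in for the ideal impulse):
+
+@example
+import numpy as np
+
+l = np.linspace(-10, 10, 200001)
+dl = l[1] - l[0]
+l0, eps = 2.0, 0.01
+
+# narrow normalized Gaussian approximating delta(l - l0)
+delta = np.exp(-(l - l0)**2/(2*eps**2)) / (eps*np.sqrt(2*np.pi))
+f = np.cos(l)
+
+print((f*delta).sum()*dl, np.cos(l0))   # nearly identical
+@end example
+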
-@node Warp, , Convolve, Data manipulation
-@section Warp
-Image warping is the process of mapping the pixels of one image onto a new
pixel grid.
-This process is sometimes known as transformation, however following the
discussion of Heckbert 1989@footnote{Paul S. Heckbert. 1989. @emph{Fundamentals
of Texture mapping and Image Warping}, Master's thesis at University of
California, Berkeley.} we will not be using that term because it can be
confused with only pixel value or flux transformations.
-Here we specifically mean the pixel grid transformation which is better
conveyed with `warp'.
+@noindent
+The Dirac @mymath{\delta} function also operates similarly if we use
summations instead of integrals.
+The Fourier transform of the delta function is:
-@cindex Gravitational lensing
-Image warping is a very important step in astronomy, both in observational
data analysis and in simulating modeled images.
-In modeling, warping an image is necessary when we want to apply grid
transformations to the initial models, for example, in simulating gravitational
lensing.
-Observational reasons for warping an image are listed below:
+@dispmath{{\cal
F}[\delta(l)]=\int_{-\infty}^{\infty}\delta(l)e^{-i{\omega}l}dl=e^{-i{\omega}0}=1}
-@itemize
+@dispmath{{\cal
F}[\delta(l-l_0)]=\int_{-\infty}^{\infty}\delta(l-l_0)e^{-i{\omega}l}dl=e^{-i{\omega}l_0}}
-@cindex Signal to noise ratio
-@item
-@strong{Noise:} Most scientifically interesting targets are inherently faint
(have a very low signal-to-noise ratio).
-Therefore one short exposure is not enough to detect such objects that are
drowned deeply in the noise.
-We need multiple exposures so we can add them together and increase the
objects' signal to noise ratio.
-Keeping the telescope fixed on one field of the sky is practically impossible.
-Therefore very deep observations have to be put onto the same grid before
adding them.
+@noindent
+From the definition of the Dirac @mymath{\delta} we can also define a
+Dirac comb (@mymath{{\rm III}_P}) or an impulse train with infinite
+impulses separated by @mymath{P}:
-@cindex Mosaicing
-@cindex Image mosaic
-@item
-@strong{Resolution:} If we have multiple images of one patch of the sky
(hopefully at multiple orientations) we can warp them to the same grid.
-The multiple orientations will allow us to `guess' the values of pixels on an
output pixel grid that has smaller pixel sizes and thus increase the resolution
of the output.
-This process of merging multiple observations is known as Mosaicing.
+@dispmath{
+{\rm III}_P(l)\equiv\displaystyle\sum_{k=-\infty}^{\infty}\delta(l-kP) }
-@cindex Cosmic rays
-@item
-@strong{Cosmic rays:} Cosmic rays can randomly fall on any part of an image.
-If they collide vertically with the camera, they are going to create a very
sharp and bright spot that in most cases can be separated easily@footnote{All
astronomical targets are blurred with the PSF, see @ref{PSF}, however a cosmic
ray is not and so it is very sharp (it suddenly stops at one pixel).}.
-However, depending on the depth of the camera pixels and the angle at which a
cosmic ray collides with the camera, it can cover a larger, line-like area on
the CCD, which makes detection through its sharp edges very hard and
error-prone.
-One of the best methods to remove cosmic rays is to compare multiple images of
the same field.
-To do that, we need all the images to be on the same pixel grid.
-@cindex Optical distortion
-@cindex Distortion, optical
-@item
-@strong{Optical distortion:} In wide field images, the optical distortion that
occurs on the outer parts of the focal plane will make accurate comparison of
the objects at various locations impossible.
-It is therefore necessary to warp the image and correct for those distortions
prior to the analysis.
+@noindent
+@mymath{P} is chosen to represent ``pixel width'' later in @ref{Sampling
theorem}.
+Therefore the Dirac comb is periodic with a period of @mymath{P}.
+We have intentionally used a different name for the period of the Dirac comb
compared to the input signal's length of observation that we showed with
@mymath{L} in @ref{Fourier series}.
+This difference is highlighted here to avoid confusion later when these two
periods are needed together in @ref{Discrete Fourier transform}.
+The Fourier transform of the Dirac comb will be necessary in @ref{Sampling
theorem}, so let's derive it.
+By its definition, it is periodic, with a period of @mymath{P}, so the Fourier
coefficients of its Fourier Series (@ref{Fourier series}) can be calculated
within one period:
-@cindex ACS
-@cindex CCD
-@cindex WFC3
-@cindex Wide Field Camera 3
-@cindex Charge-coupled device
-@cindex Advanced camera for surveys
-@cindex Hubble Space Telescope (HST)
-@item
-@strong{Detector not on focal plane:} In some cases (like the Hubble Space
Telescope ACS and WFC3 cameras), the CCD might be tilted compared to the focal
plane; therefore the recorded CCD pixels have to be projected onto the focal
plane before further analysis.
+@dispmath{{\rm
III}_P=\displaystyle\sum_{n=-\infty}^{\infty}c_ne^{i{2{\pi}n\over
+P}l}}
-@end itemize
+@noindent
+We can now find the @mymath{c_n} from @ref{Fourier series}:
-@menu
-* Linear warping basics:: Basics of coordinate transformation.
-* Merging multiple warpings:: How to merge multiple matrices.
-* Resampling:: Warping an image is re-sampling it.
-* Moire pattern and its correction:: Spatial resonance of the grid pattern on
output.
-* Invoking astwarp:: Arguments and options for Warp.
-@end menu
+@dispmath{
+c_n={1\over P}\int_{-P/2}^{P/2}\delta(l)e^{-i{2{\pi}n\over P}l}dl
+={1\over P}\quad\quad \rightarrow \quad\quad
+{\rm III}_P={1\over P}\displaystyle\sum_{n=-\infty}^{\infty}e^{i{2{\pi}n\over
P}l}
+}
-@node Linear warping basics, Merging multiple warpings, Warp, Warp
-@subsection Linear warping basics
+@noindent
+So we can write the Fourier transform of the Dirac comb as:
-@cindex Scaling
-@cindex Coordinate transformation
-Let's take @mymath{\left[\matrix{u&v}\right]} as the coordinates of a point in
the input image and @mymath{\left[\matrix{x&y}\right]} as the coordinates of
that same point in the output image@footnote{These can be any real number; we
are not necessarily talking about integer pixels here.}.
-The simplest form of coordinate transformation (or warping) is the scaling of
the coordinates, let's assume we want to scale the first axis by @mymath{M} and
the second by @mymath{N}, the output coordinates of that point can be
calculated by
+@dispmath{
+{\cal F}[{\rm III}_P]=\int_{-\infty}^{\infty}{\rm III}_Pe^{-i{\omega}l}dl
+={1\over
P}\displaystyle\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-i(\omega-{2{\pi}n\over
P})l}dl={1\over
P}\displaystyle\sum_{n=-\infty}^{\infty}\delta\left(\omega-{2{\pi}n\over
P}\right)
+}
-@dispmath{\left[\matrix{x\cr y}\right]=
- \left[\matrix{Mu\cr Nv}\right]=
- \left[\matrix{M&0\cr0&N}\right]\left[\matrix{u\cr v}\right]}
-@cindex Matrix
-@cindex Multiplication, Matrix
-@cindex Rotation of coordinates
@noindent
-Note that these are matrix multiplications.
-We thus see that we can represent any such grid warping as a matrix.
-Another thing we can do with this @mymath{2\times2} matrix is to rotate the
output coordinate around the common center of both coordinates.
-If the output is rotated anticlockwise by @mymath{\theta} degrees from the
positive (to the right) horizontal axis, then the warping matrix should become:
+In the last step, we used the fact that the complex exponential is a periodic
function, that @mymath{n} is an integer and that, as we defined in @ref{Fourier
transform}, @mymath{\omega{\equiv}m\omega_0}, where @mymath{m} was an integer.
+The integral will be zero for any @mymath{\omega} that is not equal to
@mymath{2{\pi}n/P}; a more complete explanation can be seen in @ref{Fourier
series}.
+Therefore, while in the spatial domain the impulses had a spacing of @mymath{P}
(meters for example), in the frequency space the spacing between the different
impulses is @mymath{2\pi/P} cycles per meter.
-@dispmath{\left[\matrix{x\cr y}\right]=
- \left[\matrix{u\cos\theta-v\sin\theta\cr u\sin\theta+v\cos\theta}\right]=
- \left[\matrix{\cos\theta&-\sin\theta\cr \sin\theta&\cos\theta}\right]
- \left[\matrix{u\cr v}\right]
- }
-@cindex Flip coordinates
+@node Convolution theorem, Sampling theorem, Dirac delta and comb, Frequency
domain and Fourier operations
+@subsubsection Convolution theorem
+
+The convolution (shown with the @mymath{\ast} operator) of the two
+functions @mymath{f(l)} and @mymath{h(l)} is defined as:
+
+@dispmath{
+c(l)\equiv[f{\ast}h](l)=\int_{-\infty}^{\infty}f(\tau)h(l-\tau)d\tau
+}
+
@noindent
-We can also flip the coordinates around the first axis, the second axis and
the coordinate center with the following three matrices respectively:
+See @ref{Convolution process} for a more detailed physical (pixel based)
interpretation of this definition.
+The Fourier transform of convolution (@mymath{C(\omega)}) can be written as:
-@dispmath{\left[\matrix{1&0\cr0&-1}\right]\quad\quad
- \left[\matrix{-1&0\cr0&1}\right]\quad\quad
- \left[\matrix{-1&0\cr0&-1}\right]}
+@dispmath{
+ C(\omega)=\int_{-\infty}^{\infty}[f{\ast}h](l)e^{-i{\omega}l}dl=
+
\int_{-\infty}^{\infty}f(\tau)\left[\int_{-\infty}^{\infty}h(l-\tau)e^{-i{\omega}l}dl\right]d\tau
+}
-@cindex Shear
@noindent
-The final thing we can do with this definition of a @mymath{2\times2} warping
matrix is shear.
-If we want the output to be sheared along the first axis with @mymath{A} and
along the second with @mymath{B}, then we can use the matrix:
+To solve the inner integral, let's define @mymath{s{\equiv}l-\tau}, so
+that @mymath{ds=dl} and @mymath{l=s+\tau}; the inner integral then
+becomes:
-@dispmath{\left[\matrix{1&A\cr B&1}\right]}
+@dispmath{
+\int_{-\infty}^{\infty}h(l-\tau)e^{-i{\omega}l}dl=
+\int_{-\infty}^{\infty}h(s)e^{-i{\omega}(s+\tau)}ds=e^{-i{\omega}\tau}\int_{-\infty}^{\infty}h(s)e^{-i{\omega}s}ds=H(\omega)e^{-i{\omega}\tau}
+}
@noindent
-To have one matrix representing any combination of these steps, you use matrix
multiplication, see @ref{Merging multiple warpings}.
-So any combinations of these transformations can be displayed with one
@mymath{2\times2} matrix:
+where @mymath{H(\omega)} is the Fourier transform of @mymath{h(l)}.
+Substituting this result for the inner integral above, we get:
-@dispmath{\left[\matrix{a&b\cr c&d}\right]}
+@dispmath{
+C(\omega)=H(\omega)\int_{-\infty}^{\infty}f(\tau)e^{-i{\omega}\tau}d\tau=H(\omega)F(\omega)=F(\omega)H(\omega)
+}
-@cindex Wide Field Camera 3
-@cindex Advanced Camera for Surveys
-@cindex Hubble Space Telescope (HST)
-The transformations above can cover a lot of the needs of most coordinate
transformations.
-However they are limited to mapping the point @mymath{[\matrix{0&0}]} to
@mymath{[\matrix{0&0}]}.
-Therefore they are useless if you want one coordinate to be shifted compared
to the other one.
-They are also space invariant, meaning that all the coordinates in the image
will receive the same transformation.
-In other words, all the pixels in the output image will have the same area if
placed over the input image.
-So transformations which require varying output pixel sizes like projections
cannot be applied through this @mymath{2\times2} matrix either (for example,
for the tilted ACS and WFC3 camera detectors on board the Hubble space
telescope).
+@noindent
+where @mymath{F(\omega)} is the Fourier transform of @mymath{f(l)}.
+So by multiplying the Fourier transforms of the two functions, we get the
Fourier transform of their convolution.
+The convolution theorem also holds in the reverse direction: a convolution in
the frequency domain corresponds to a multiplication in the spatial domain.
+Let's define:
-@cindex M@"obius, August. F.
-@cindex Homogeneous coordinates
-@cindex Coordinates, homogeneous
-To add these further capabilities, namely translation and projection, we use
the homogeneous coordinates.
-They were defined about 200 years ago by August Ferdinand M@"obius (1790 --
1868).
-For simplicity, we will only discuss points on a 2D plane and avoid the
complexities of higher dimensions.
-We cannot provide a deep mathematical introduction here, interested readers
can get a more detailed explanation from
Wikipedia@footnote{@url{http://en.wikipedia.org/wiki/Homogeneous_coordinates}}
and the references therein.
+@dispmath{D(\omega){\equiv}F(\omega){\ast}H(\omega)}
-By adding an extra coordinate to a point we can add the flexibility we need.
-The point @mymath{[\matrix{x&y}]} can be represented as
@mymath{[\matrix{xZ&yZ&Z}]} in homogeneous coordinates.
-Therefore multiplying all the coordinates of a point in the homogeneous
coordinates with a constant will give the same point.
-Put another way, the point @mymath{[\matrix{x&y&Z}]} corresponds to the point
@mymath{[\matrix{x/Z&y/Z}]} on the constant @mymath{Z} plane.
-Setting @mymath{Z=1}, we get the input image plane, so
@mymath{[\matrix{u&v&1}]} corresponds to @mymath{[\matrix{u&v}]}.
-With this definition, the transformations above can be generally written as:
+@noindent
+Applying the inverse Fourier Transform or synthesis equation (@ref{Fourier
transform}) to both sides and following the same steps above, we get:
-@dispmath{\left[\matrix{x\cr y\cr 1}\right]=
- \left[\matrix{a&b&0\cr c&d&0\cr 0&0&1}\right]
- \left[\matrix{u\cr v\cr 1}\right]}
+@dispmath{d(l)=f(l)h(l)}
@noindent
-@cindex Affine Transformation
-@cindex Transformation, affine
-We thus acquired 4 extra degrees of freedom.
-By giving non-zero values to the zero valued elements of the last column we
can have translation (try the matrix multiplication!).
-In general, any coordinate transformation that is represented by the matrix
below is known as an affine
transformation@footnote{@url{http://en.wikipedia.org/wiki/Affine_transformation}}:
+where @mymath{d(l)} is the inverse Fourier transform of @mymath{D(\omega)}.
+We can therefore re-write the two equations above formally as the convolution
theorem:
-@dispmath{\left[\matrix{a&b&c\cr d&e&f\cr 0&0&1}\right]}
+@dispmath{
+ {\cal F}[f{\ast}h]={\cal F}[f]{\cal F}[h]
+}
-@cindex Homography
-@cindex Projective transformation
-@cindex Transformation, projective
-We can now consider translation, but the affine transform is still spatially
invariant.
-Giving non-zero values to the other two elements in the matrix above gives us
the projective transformation or
Homography@footnote{@url{http://en.wikipedia.org/wiki/Homography}} which is the
most general type of transformation with the @mymath{3\times3} matrix:
+@dispmath{
+ {\cal F}[fh]={\cal F}[f]\ast{\cal F}[h]
+}
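+
+As a minimal numerical check of the theorem (a sketch in Python with NumPy;
the explicit sum below implements the discrete, cyclic form of the convolution
integral):
+
+@example
+import numpy as np
+
+rng = np.random.default_rng(1)
+f = rng.normal(size=64)
+h = rng.normal(size=64)
+
+# cyclic convolution, computed directly in the spatial domain
+c = np.array([(f*np.roll(h[::-1], k+1)).sum() for k in range(64)])
+
+# the same result through the convolution theorem
+c2 = np.fft.ifft(np.fft.fft(f)*np.fft.fft(h)).real
+
+print(np.allclose(c, c2))    # True
+@end example
+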
-@dispmath{\left[\matrix{x'\cr y'\cr w}\right]=
- \left[\matrix{a&b&c\cr d&e&f\cr g&h&1}\right]
- \left[\matrix{u\cr v\cr 1}\right]}
+Besides its usefulness in blurring an image by convolving it with a given
kernel, the convolution theorem also enables us to do another very useful
operation in data analysis: to match the blur (or PSF) between two images taken
with different telescopes/cameras or under different atmospheric conditions.
+This process is also known as de-convolution.
+Let's take @mymath{f(l)} as the image with a narrower PSF (less blurry) and
@mymath{c(l)} as the image with a wider PSF which appears more blurred.
+Also let's take @mymath{h(l)} to represent the kernel that should be convolved
with the sharper image to create the more blurry image.
+Above, we proved the relation between these three images through the
convolution theorem.
+But there, we assumed that @mymath{f(l)} and @mymath{h(l)} are known (given)
and the convolved image is desired.
-@noindent
-So the output coordinates can be calculated from:
+In de-convolution, we have @mymath{f(l)} (the sharper image) and
@mymath{[f{\ast}h](l)} (the more blurry image) and we want to find the kernel
@mymath{h(l)}.
+The solution is a direct result of the convolution theorem:
-@dispmath{x={x' \over w}={au+bv+c \over gu+hv+1}\quad\quad\quad\quad
- y={y' \over w}={du+ev+f \over gu+hv+1}}
+@dispmath{
+ {\cal F}[h]={{\cal F}[f{\ast}h]\over {\cal F}[f]}
+ \quad\quad
+ {\rm or}
+ \quad\quad
+ h(l)={\cal F}^{-1}\left[{{\cal F}[f{\ast}h]\over {\cal F}[f]}\right]
+}
-Thus with Homography we can change the sizes of the output pixels on the input
plane, giving a `perspective'-like visual impression.
-This can be quantitatively seen in the two equations above.
-When @mymath{g=h=0}, the denominator is independent of @mymath{u} or
@mymath{v} and thus we have spatial invariance.
-Homography preserves lines at all orientations.
-A very useful fact about Homography is that its inverse is also a Homography.
-These two properties play a very important role in the implementation of this
transformation.
-A short but instructive and illustrated review of affine, projective and also
bi-linear mappings is provided in Heckbert 1989@footnote{
-Paul S. Heckbert. 1989. @emph{Fundamentals of Texture mapping and Image
Warping}, Master's thesis at University of California, Berkeley.
-Note that since points are defined as row vectors there, the matrix is the
transpose of the one discussed here.}.
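+
+As a small numerical illustration of these homogeneous-coordinate warps (a
sketch in Python with NumPy, not Warp's internal implementation; it also
previews the matrix merging of @ref{Merging multiple warpings} by rotating
about a pivot and checking that the pivot stays fixed):
+
+@example
+import numpy as np
+
+# translation by (5,2) through homogeneous coordinates
+T = np.array([[1., 0., 5.], [0., 1., 2.], [0., 0., 1.]])
+print(T.dot(np.array([1., 1., 1.])))   # (1,1) -> (6,3)
+
+# rotation about a pivot (U,V): translate, rotate, translate back;
+# the right-most matrix acts first on the column vector.
+U, V, t = 3.0, 4.0, np.radians(40)
+c, s = np.cos(t), np.sin(t)
+R  = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
+T1 = np.array([[1., 0., U], [0., 1., V], [0., 0., 1.]])
+T2 = np.array([[1., 0., -U], [0., 1., -V], [0., 0., 1.]])
+
+M = T1.dot(R).dot(T2)                  # merged single warp
+print(M.dot(np.array([U, V, 1.])))     # pivot (U,V) is unchanged
+@end example
+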
+While this works nicely in theory, it has two practical problems:
-@node Merging multiple warpings, Resampling, Linear warping basics, Warp
-@subsection Merging multiple warpings
+@itemize
-@cindex Commutative property
-@cindex Matrix multiplication
-@cindex Multiplication, matrix
-@cindex Non-commutative operations
-@cindex Operations, non-commutative
-In @ref{Linear warping basics} we saw how a basic warp/transformation can be
represented with a matrix.
-To make more complex warpings (for example, to define a translation, rotation
and scale as one warp) the individual matrices have to be multiplied through
matrix multiplication.
-However matrix multiplication is not commutative, so the order of the set of
matrices you use for the multiplication is going to be very important.
+@item
+If @mymath{{\cal F}[f]} has any zero values, then the division (and thus the
inverse Fourier transform) will be undefined at those frequencies.
-The first warping should be placed as the right-most matrix.
-The second warping goes to the left of that and so on.
-The second transformation is going to occur on the coordinates already warped
by the first.
-As an example for merging a few transforms into one matrix, the multiplication
below represents the rotation of an image about a point @mymath{[\matrix{U&V}]}
anticlockwise from the horizontal axis by an angle of @mymath{\theta}.
-To do this, first we take the origin to @mymath{[\matrix{U&V}]} through
translation.
-Then we rotate the image, then we translate it back to where it was initially.
-These three operations can be merged in one operation by calculating the
matrix multiplication below:
+@item
+If there is significant noise in the image, then the high frequencies of the
noise are going to significantly reduce the quality of the final result.
-@dispmath{\left[\matrix{1&0&U\cr0&1&V\cr{}0&0&1}\right]
- \left[\matrix{\cos\theta&-\sin\theta&0\cr \sin\theta&\cos\theta&0\cr
0&0&1}\right]
- \left[\matrix{1&0&-U\cr0&1&-V\cr{}0&0&1}\right]}
+@end itemize
+A standard solution to both these problems is the Wiener de-convolution
+algorithm@footnote{@url{https://en.wikipedia.org/wiki/Wiener_deconvolution}}.
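+
+As a rough sketch of this PSF-matching procedure (in Python with NumPy; the
two Gaussians below are hypothetical PSFs, and the @code{1e-8} cut sidesteps
the division-by-zero problem just described, playing a role like a minimum
threshold on the sharper image's spectrum):
+
+@example
+import numpy as np
+
+x = np.arange(128) - 64.0
+def gauss(sig):
+    g = np.exp(-x**2/(2*sig**2))
+    return g/g.sum()
+
+F  = np.fft.fft(np.fft.ifftshift(gauss(2.0)))  # sharper PSF
+Fh = np.fft.fft(np.fft.ifftshift(gauss(3.0)))  # blurrier PSF
+
+# divide only where the sharper spectrum is significant
+ratio = np.where(np.abs(F) > 1e-8, Fh/F, 0)
+h = np.fft.fftshift(np.fft.ifft(ratio).real)   # matching kernel
+
+print(abs(h.sum() - 1) < 1e-6)    # True: kernel sums to unity
+@end example
+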
+@node Sampling theorem, Discrete Fourier transform, Convolution theorem,
Frequency domain and Fourier operations
+@subsubsection Sampling theorem
+Our mathematical functions are continuous; however, our data collecting and
measuring tools are discrete.
+Here we want to give a mathematical formulation for digitizing the continuous
mathematical functions so that later, we can retrieve the continuous function
from the digitized recorded input.
+Assuming that we have a continuous function @mymath{f(l)}, then we can define
@mymath{f_s(l)} as the `sampled' @mymath{f(l)} through the Dirac comb (see
@ref{Dirac delta and comb}):
+@dispmath{
+f_s(l)=f(l){\rm III}_P=\displaystyle\sum_{n=-\infty}^{\infty}f(l)\delta(l-nP)
+}
-@node Resampling, Moire pattern and its correction, Merging multiple warpings,
Warp
-@subsection Resampling
+@noindent
+The discrete data-element @mymath{f_k} (for example, a pixel in an
+image), where @mymath{k} is an integer, can thus be represented as:
-@cindex Pixel
-@cindex Camera
-@cindex Detector
-@cindex Sampling
-@cindex Resampling
-@cindex Pixel mixing
-@cindex Photoelectrons
-@cindex Picture element
-@cindex Mixing pixel values
-A digital image is composed of discrete `picture elements' or `pixels'.
-When a real image is created from a camera or detector, each pixel's area is
used to store the number of photo-electrons that were created when incident
photons collided with that pixel's surface area.
-This process is called the `sampling' of a continuous or analog data into
digital data.
+@dispmath{f_k=\int_{-\infty}^{\infty}f(l)\delta(l-kP)dl=f(kP)}
-When we change the pixel grid of an image, or ``warp'' it, we have to
calculate the flux value of each pixel on the new grid based on the old grid,
or resample it.
-Because of the calculation (as opposed to observation), any form of warping on
the data is going to degrade the image and mix the original pixel values with
each other.
-So if an analysis can be done on an unwarped data image, it is best to leave
the image untouched and pursue the analysis.
-However as discussed in @ref{Warp} this is not possible in some scenarios and
re-sampling is necessary.
+Note that in practice, our discrete data points are not found in this fashion.
+Each detector pixel (in an image for example) has an area and averages the
signal it receives over that area, not a mathematical point as the Dirac
@mymath{\delta} function defines.
+However, as long as the variation in the signal over one detector pixel is not
significant, this can be a good approximation.
+Having put this issue to the side, we can now try to find the relation between
the Fourier transforms of the un-sampled @mymath{f(l)} and the sampled
@mymath{f_s(l)}.
+For clearer notation, let's define:
-@cindex Point pixels
-@cindex Interpolation
-@cindex Sampling theorem
-@cindex Bicubic interpolation
-@cindex Signal to noise ratio
-@cindex Bi-linear interpolation
-@cindex Interpolation, bicubic
-@cindex Interpolation, bi-linear
-When the FWHM of the PSF of the camera is much larger than the pixel scale
(see @ref{Sampling theorem}), we are sampling the signal at a much higher
resolution than the camera can offer.
-This is usually the case in many applications of image processing
(non-astronomical imaging).
-In such cases, we can consider each pixel to be a point and not an area: the
PSF doesn't vary much over a single pixel.
-
-Approximating a pixel's area with a point can significantly speed up the
resampling and also simplify the code.
-Resampling then becomes a problem of interpolation: points of the input grid
need to be interpolated at certain other points (over the output grid).
-To increase the accuracy, you might also sample more than one point from
within a pixel giving you more points for a more accurate interpolation in the
output grid.
+@dispmath{F_s(\omega)\equiv{\cal F}[f_s]}
-@cindex Image edges
-@cindex Edges, image
-However, interpolation has several problems.
-The first one is that it will depend on the type of function you want to
assume for the interpolation.
-For example, you can choose a bi-linear or bi-cubic (the `bi's are for the 2
dimensional nature of the data) interpolation method.
-For the latter there are various ways to set the constants@footnote{see
@url{http://entropymine.com/imageworsener/bicubic/} for a nice introduction.}.
-Such parametric interpolation functions can fail seriously on the edges of an
image, or when there is a sharp change in value (for example, the bleeding
saturation of bright stars in astronomical CCDs).
-They will also need normalization so that the flux of the objects before and
after the warping is comparable.
+@dispmath{D(\omega)\equiv{\cal F}[{\rm III}_P]}
-The parametric nature of these methods adds a level of subjectivity to the
data (it makes more assumptions through the functions than the data can handle).
-For most applications this is fine (as discussed above: when the PSF is
over-sampled), but in scientific applications where we push our instruments to
the limit and the aim is the detection of the faintest possible galaxies or
fainter parts of bright galaxies, we cannot afford this loss.
-For these reasons, Warp does not use parametric interpolation techniques.
+@noindent
+Then using the Convolution theorem (see @ref{Convolution theorem}),
+@mymath{F_s(\omega)} can be written as:
-@cindex Drizzle
-@cindex Pixel mixing
-@cindex Exact area resampling
-Warp will do interpolation based on ``pixel mixing''@footnote{For a graphic
demonstration see @url{http://entropymine.com/imageworsener/pixelmixing/}.} or
``area resampling''.
-This is also similar to what the Hubble Space Telescope pipeline calls
``Drizzling''@footnote{@url{http://en.wikipedia.org/wiki/Drizzle_(image_processing)}}.
-This technique requires no functions, it is thus non-parametric.
-It is also the closest we can get (make least assumptions) to what actually
happens on the detector pixels.
+@dispmath{F_s(\omega)={\cal F}[f(l){\rm III}_P]=F(\omega){\ast}D(\omega)}
-In pixel mixing, the basic idea is that you reverse-transform each output
pixel to find which pixels of the input image it covers, and what fraction of
the area of the input pixels are covered by that output pixel.
-We then multiply each input pixel's value by the fraction of its area that
overlaps with the output pixel (between 0 to 1).
-The output's pixel value is derived by summing all these multiplications for
the input pixels that it covers.
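+
+As a rough sketch of this process on a 1D pixel row (in Python with NumPy;
@code{pixmix_1d} is a hypothetical helper, not Warp's actual implementation,
and only handles a simple scaling of the grid):
+
+@example
+import numpy as np
+
+def pixmix_1d(inp, scale):
+    # Each output pixel covers 'scale' input pixels; add every input
+    # pixel's value weighted by the fraction of its area covered.
+    n_out = int(len(inp)/scale)
+    out = np.zeros(n_out)
+    for o in range(n_out):
+        a, b = o*scale, (o + 1)*scale    # output pixel's edges
+        for i in range(int(a), int(np.ceil(b))):
+            out[o] += inp[i] * (min(b, i + 1) - max(a, i))
+    return out
+
+row = np.array([0., 0., 10., 0., 0., 0.])
+print(pixmix_1d(row, 1.5))    # [0. 10. 0. 0.]: flux is preserved
+@end example
+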
+@noindent
+Finally, from the definition of convolution and the Fourier transform
+of the Dirac comb (see @ref{Dirac delta and comb}), we get:
-Through this process, pixels are treated as an area not as a point (which is
how detectors create the image), also the brightness (see @ref{Brightness flux
magnitude}) of an object will be fully preserved.
-Since it involves the mixing of the input's pixel values, this pixel mixing
method is a form of @ref{Spatial domain convolution}.
-Therefore, after comparing the input and output, you will notice that the
output is slightly smoothed, thus boosting the more diffuse signal, but
creating correlated noise.
-In astronomical imaging the correlated noise will be decreased later when you
stack many exposures@footnote{If you are working on a single exposure image and
see pronounced Moir@'e patterns after Warping, check @ref{Moire pattern and its
correction} for a possible way to reduce them}.
+@dispmath{
+\eqalign{
+F_s(\omega) &= \int_{-\infty}^{\infty}F(\mu)D(\omega-\mu)d\mu \cr
+&= {1\over
P}\displaystyle\sum_{n=-\infty}^{\infty}\int_{-\infty}^{\infty}F(\mu)\delta\left(\omega-\mu-{2{\pi}n\over
P}\right)d\mu \cr
+&= {1\over P}\displaystyle\sum_{n=-\infty}^{\infty}F\left(
+ \omega-{2{\pi}n\over P}\right).\cr }
+}
-If there are very high spatial-frequency signals in the image (for example,
fringes) which vary on a scale @emph{smaller than} your output image pixel size
(this is rarely the case in astronomical imaging), pixel mixing can cause
aliasing@footnote{@url{http://en.wikipedia.org/wiki/Aliasing}}.
-Therefore, in case such fringes are present, they have to be calculated and
removed separately (which would naturally be done in any astronomical reduction
pipeline).
-Because of the PSF, no astronomical target has a sharp change in their signal.
-Thus this issue is less important for astronomical applications, see @ref{PSF}.
+@mymath{F(\omega)} was only a single function, see @ref{samplingfreq}(left).
+However, from the sampled Fourier transform function we see that
@mymath{F_s(\omega)} is the superposition of infinite copies of
@mymath{F(\omega)} that have been shifted, see @ref{samplingfreq}(right).
+From the equation, it is clear that the shift in each copy is @mymath{2\pi/P}.
-To find the overlap area of the output pixel over the input pixels, we need to
define polygons and clip them (find the overlap).
-Usually, it is sufficient to define a pixel with a four-vertex polygon.
-However, when a non-linear distortion (for example, @code{SIP} or @code{TPV})
is present and the distortion is significant over an output pixel's size
(usually far from the reference point), the shadow of the output pixel on the
input grid can be curved.
-To account for such cases (which can only happen when correcting for
non-linear distortions), Warp has the @option{--edgesampling} option to sample
the output pixel over more vertices.
-For more, see the description of this option in @ref{Align pixels with WCS
considering distortions}.
+@float Figure,samplingfreq
+@image{gnuastro-figures/samplingfreq, 15.2cm, , }
+@caption{Sampling causes infinite repetition in the frequency domain.
+FT is an abbreviation for `Fourier transform'.
+@mymath{\omega_m} represents the maximum frequency present in the input.
+@mymath{F(\omega)} is only symmetric on both sides of 0 when the input is real
(not complex).
+In general @mymath{F(\omega)} is complex and thus cannot be simply plotted
like this.
+Here we have assumed a real Gaussian @mymath{f(l)} which has produced a
Gaussian @mymath{F(\omega)}.}
+@end float
-@node Moire pattern and its correction, Invoking astwarp, Resampling, Warp
-@subsection Moir@'e pattern and its correction
+The input @mymath{f(l)} can have any distribution of frequencies in it.
+In the example of @ref{samplingfreq}(left), the input consisted of a range of
frequencies equal to @mymath{\Delta\omega=2\omega_m}.
+Fortunately as @ref{samplingfreq}(right) shows, the assumed pixel size
(@mymath{P}) we used to sample this hypothetical function was such that
@mymath{2\pi/P>\Delta\omega}.
+The consequence is that each copy of @mymath{F(\omega)} has become completely
separate from the surrounding copies.
+Such a digitized (sampled) data set is thus called @emph{over-sampled}.
+When @mymath{2\pi/P=\Delta\omega}, @mymath{P} is just small enough to finely
separate even the largest frequencies in the input signal and thus it is known
as @emph{critically-sampled}.
+Finally if @mymath{2\pi/P<\Delta\omega} we are dealing with an
@emph{under-sampled} data set.
+In an under-sampled data set, the separate copies of @mymath{F(\omega)} are
going to overlap, preventing us from recovering the higher constituent
frequencies of @mymath{f(l)}.
+The effects of under-sampling in an image with high rates of change (for
example, a brick wall imaged from a distance) can clearly be seen visually and
are known as @emph{aliasing}.
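+
+These regimes can be demonstrated numerically with a small sketch (in Python
with NumPy): a pure sine of 3 cycles per unit length is sampled with two
hypothetical pixel sizes, one finer and one coarser than the critical value.
+
+@example
+import numpy as np
+
+freq = 3.0                          # 3 cycles per unit length
+for P in (0.1, 0.4):                # critical limit: 1/(2P)
+    s = np.sin(2*np.pi*freq*np.arange(0, 2, P))
+    spec = np.abs(np.fft.rfft(s))   # amplitude spectrum
+    k = spec[1:].argmax() + 1       # strongest non-zero frequency
+    print(P, np.fft.rfftfreq(s.size, P)[k])
+
+# P=0.1 (over-sampled):  prints 3.0 (recovered correctly)
+# P=0.4 (under-sampled): prints 0.5 (the sine is aliased)
+@end example
+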
-@cindex Moir@'e pattern or fringes
-After warping some images with the default mode of Warp (see @ref{Align pixels
with WCS considering distortions}) you may notice that the background noise is
no longer flat.
-Some regions will be smoother and some will be sharper, depending on the
orientation and distortion of the input/output pixel grids.
-This is due to the @url{https://en.wikipedia.org/wiki/Moir%C3%A9_pattern,
Moir@'e pattern}, which is especially noticeable/significant when two slightly
different grids are super-imposed.
+When the input @mymath{f(l)} is composed of a finite range of frequencies,
@mymath{f(l)} is known as a @emph{band-limited} function.
+The example in @ref{samplingfreq}(left) was a nice demonstration of such a
case: for all @mymath{\omega<-\omega_m} or @mymath{\omega>\omega_m}, we have
@mymath{F(\omega)=0}.
+Therefore, when the input function is band-limited and our detector's pixels
are placed such that we have critically (or over-) sampled it, then we can
exactly reproduce the continuous @mymath{f(l)} from the discrete or digitized
samples.
+To do that, we just have to isolate one copy of @mymath{F(\omega)} from the
infinite copies and take its inverse Fourier transform.
-With the commands below, we'll download a single exposure image from the
@url{https://www.j-plus.es,J-PLUS survey} and run Warp (on an @mymath{8\times8}
arcmin@mymath{^2} region to speed up the demos here).
-Finally, we'll open the image to visually see the artificial Moir@'e pattern
on the warped image.
+This ability to exactly reproduce the continuous input from the sampled or
digitized data leads us to the @emph{sampling theorem} which connects the
inherent property of the continuous signal (its maximum frequency) to that of
the detector (the spacing between its pixels).
+The sampling theorem states that the full (continuous) signal can be recovered
when the pixel size (@mymath{P}) and the maximum constituent frequency in the
signal (@mymath{\omega_m}) have the following relation@footnote{This equation
is also shown in some places without the @mymath{2\pi}.
+Whether @mymath{2\pi} is included or not depends on how you define the
frequency.}:
-@example
-## Download the image (73.7 MB containing a 9216x9232 pixel image)
-$ jplusdr2=http://archive.cefca.es/catalogues/vo/siap/jplus-dr2/reduced
-$ wget $jplusdr2/get_fits?id=771463 -Ojplus-exp1.fits.fz
+@dispmath{{2\pi\over P}>2\omega_m}
-## Align a small part of it with the sky coordinates.
-$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
- --width=8/60 -ojplus-e1.fits
+@noindent
+This relation was first formulated by Harry Nyquist (1889 -- 1976 A.D.) in
1928 and formally proved in 1949 by Claude E. Shannon (1916 -- 2001 A.D.) in
what is now known as the Nyquist-Shannon sampling theorem.
+In signal processing, the signal is produced (synthesized) by a transmitter
and is received and de-coded (analyzed) by a receiver.
+Therefore, producing a band-limited signal is both necessary and, since the
transmitter is under our control, possible.
-## Open the aligned region with DS9
-$ astscript-fits-view jplus-e1.fits
-@end example
+In astronomy, we do not produce the shapes of our targets; we are only
observers.
+Galaxies can have any shape and size, therefore ideally, our signal is not
band-limited.
+However, since we are always confined to observing through an aperture, the
aperture will cause a point source (for which @mymath{\omega_m=\infty}) to be
spread over several pixels.
+This spread is quantitatively known as the point spread function or PSF.
+This spread does blur the image which is undesirable; however, for this
analysis it produces the positive outcome that there will be a finite
@mymath{\omega_m}.
+We should caution, though, that any detector will have noise, which adds many
very high frequency (ideally infinite) changes between the pixels.
+However, the coefficients of those noise frequencies are usually exceedingly
small.
-In the opened DS9 window, you can see the Moir@'e pattern as wave-like
patterns in the noise: some parts of the noise are more smooth and some parts
are more sharp.
-Right in the center of the image is a blob of sharp noise.
-Warp has the @option{--checkmaxfrac} option for direct inspection of the
Moir@'e pattern (described with the other options in @ref{Align pixels with WCS
considering distortions}).
-When run with this option, an extra HDU (called @code{MAX-FRAC}) will be added
to the output.
-The image in this HDU has the same size as the output.
-However, each output pixel will contain the largest (maximum) fraction of area
that it covered over the input pixel grid.
-So if an output pixel has a value of 0.9, this shows that it covered
@mymath{90\%} of an input pixel.
-Let's run Warp with @option{--checkmaxfrac} and see the output (after DS9
opens, in the ``Cube'' window, flip between the first and second HDUs):
+@node Discrete Fourier transform, Fourier operations in two dimensions,
Sampling theorem, Frequency domain and Fourier operations
+@subsubsection Discrete Fourier transform
-@example
-$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
- --width=8/60 -ojplus-e1.fits --checkmaxfrac
+As we have stated several times so far, the input image is a digitized,
pixelated or discrete array of values (@mymath{f_s(l)}, see @ref{Sampling
theorem}); it is not a continuous function.
+Also, all our numerical calculations can only be done on a sampled, or
discrete, Fourier transform.
+Note that @mymath{F_s(\omega)} is not discrete; it is continuous.
+One way would be to find the analytic @mymath{F_s(\omega)}, then sample it at
any desired ``freq-pixel''@footnote{We are using the made-up word
``freq-pixel'' so they are not confused with spatial domain ``pixels''.}
spacing.
+However, this process would involve two steps of operations and computers in
particular are not too good at analytic operations for the first step.
+So here, we will derive a method to directly find the `freq-pixel'ated
@mymath{F_s(\omega)} from the pixelated @mymath{f_s(l)}.
+Let's start with the definition of the Fourier transform (see @ref{Fourier
transform}):
-$ astscript-fits-view jplus-e1.fits
-@end example
+@dispmath{F_s(\omega)=\int_{-\infty}^{\infty}f_s(l)e^{-i{\omega}l}dl }
-By comparing the first and second HDUs/extensions, you will clearly see that
the regions with a sharp noise pattern fall exactly on parts of the
@code{MAX-FRAC} extension with values larger than 0.5.
-In other words, these are output pixels where one input pixel contributed more
than half of their value.
-As this fraction increases, the sharpness also increases because a single
input pixel's value dominates the value of the output pixel.
-On the other hand, when this value is small, many input pixels contribute to
that output pixel.
-This acts like a convolution, hence that output pixel becomes smoother (see
@ref{Spatial domain convolution}).
-Let's have a look at the distribution of the @code{MAX-FRAC} pixel values:
+@noindent
+From the definition of @mymath{f_s(l)} (using @mymath{x} instead of
@mymath{n}) we get:
-@example
-$ aststatistics jplus-e1.fits -hMAX-FRAC
-Statistics (GNU Astronomy Utilities) @value{VERSION}
--------
-Input: jplus-e1.fits (hdu: MAX-FRAC)
--------
- Number of elements: 744769
- Minimum: 0.250213461
- Maximum: 0.9987495374
- Mode: 0.5034223567
- Mode quantile: 0.3773819498
- Median: 0.5520805544
- Mean: 0.5693956458
- Standard deviation: 0.1554693738
--------
-Histogram:
- | ***
- | **********
- | *****************
- | ************************
- | *******************************
- | **************************************
- | *********************************************
- | ****************************************************
- | ***********************************************************
- | ******************************************************************
- |**********************************************************************
- |----------------------------------------------------------------------
-@end example
+@dispmath{
+\eqalign{
+ F_s(\omega) &= \displaystyle\sum_{x=-\infty}^{\infty}
+
\int_{-\infty}^{\infty}f(l)\delta(l-xP)e^{-i{\omega}l}dl \cr
+ &= \displaystyle\sum_{x=-\infty}^{\infty}
+ f_xe^{-i{\omega}xP}
+}
+}
-The smallest value is 0.25 (=1/4), showing that 4 input pixels contributed to
that output pixel's value, while the maximum is almost 1.0, showing that a
single input pixel defined the output pixel's value.
-You can also see that the most probable value (the mode) is 0.5, and that the
distribution is positively skewed.
+@noindent
+where @mymath{f_x} is the value of @mymath{f(l)} at the point @mymath{x}, or
the value of the @mymath{x}th pixel.
+As shown in @ref{Sampling theorem} this function is infinitely periodic with a
period of @mymath{2\pi/P}.
+So all we need is the values within one period: @mymath{0<\omega<2\pi/P}, see
@ref{samplingfreq}.
+We want @mymath{X} samples within this interval, so the frequency difference
between each frequency sample or freq-pixel is @mymath{2\pi/XP}.
+Hence we will evaluate the equation above on the points at:
-@cindex Pixel scale
-@cindex @code{CDELT}
-This is a well-known problem in astronomical imaging and professional
photography.
-If you only have a single image (that is already taken!), you can undersample
the input: set the angular size of the output pixels to be larger than the
input.
-This will decrease the resolution of your image, but will ensure that
pixel-mixing will always happen.
-In the example below we are setting the output pixel scale (which is known as
@code{CDELT} in the FITS standard) to @mymath{1/0.5=2} times the input's.
-In other words, each output pixel's edge will cover double the input pixel's
edge on the sky, and the output's number of pixels in each dimension will be
half of the previous output's.
+@dispmath{\omega={2{\pi}u\over XP} \quad\quad u = 0, 1, 2, ..., X-1}
-@example
-$ cdelt=$(astfits jplus-exp1.fits.fz --pixelscale -q \
- | awk '@{print $1@}')
-$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
- --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/0.5 \
- --checkmaxfrac
-@end example
+@noindent
+Therefore the value of the freq-pixel @mymath{u} in the frequency
+domain is:
-In the first extension, you can hardly see any Moir@'e pattern in the noise.
-When you go to the next (@code{MAX-FRAC}) extension, you will see that almost
all the pixels have a value of 1.
-Of course, decreasing the resolution by half is a little too drastic.
-Depending on your image, you may be able to reach a sufficiently good result
without such a drastic degrading of the input image.
-For example, if you want an output pixel scale that is just 1.5 times larger
than the input, you can divide the original coordinate-delta (or ``cdelt'') by
@mymath{1/1.5=0.6666} and try again.
-In the @code{MAX-FRAC} extension, you will see that the range of pixel values
is now between 0.56 to 1.0 (recall that originally, this was between 0.25 and
1.0).
-This shows that the pixels are more similarly mixed and in fact, when you look
at the actual warped image, you can hardly distinguish any Moir@'e pattern in
the noise.
+@dispmath{F_u=\displaystyle\sum_{x=0}^{X-1} f_xe^{-i{2{\pi}ux\over X}} }
-@cindex Stacking
-@cindex Dithering
-@cindex Coaddition
-However, deep astronomical data are usually built by several exposures
(images), not a single one.
-Each image is also taken by (slightly) shifting the telescope compared to the
previous exposure.
-This shift is known as ``dithering''.
-We do this for many reasons (for example tracking errors in the telescope,
high background values, removing the effect of bad pixels or those affected by
cosmic rays, robust flat pattern measurement, etc.@footnote{E.g.,
@url{https://www.stsci.edu/hst/instrumentation/wfc3/proposing/dithering-strategies}}).
-One of those ``etc.'' reasons is to correct the Moir@'e pattern in the final
coadded deep image.
+@noindent
+Therefore, we see that for each freq-pixel in the frequency domain, we are
going to need all the pixels in the spatial domain@footnote{So even if one
pixel is a blank pixel (see @ref{Blank pixels}), all the pixels in the
frequency domain will also be blank.}.
+If the input (spatial) pixel row is also @mymath{X} pixels wide, then we can
exactly recover the @mymath{x}th pixel with the following summation:
-The Moir@'e pattern is fixed to the grid of the image; slightly shifting the
telescope will result in the pattern appearing in different parts of the sky.
-Therefore when we later stack, or coadd, the separate exposures into a deep
image, the Moir@'e pattern will be decreased there.
-However, dithering has possible drawbacks based on the scientific goal: for
example, when observing time-variable phenomena, where splitting the exposure
into several shorter ones is not feasible.
-If this is not the case for you (for example in galaxy evolution), continue
with the rest of this section.
+@dispmath{f_x={1\over X}\displaystyle\sum_{u=0}^{X-1} F_ue^{i{2{\pi}ux\over X}} }
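+
+Both sums can be checked numerically against a mature FFT implementation (a
sketch in Python; NumPy's @code{np.fft} uses the same @mymath{2\pi}
convention):
+
+@example
+import numpy as np
+
+rng = np.random.default_rng(0)
+X = 8
+f = rng.normal(size=X)
+n = np.arange(X)
+
+# analysis: every freq-pixel u needs all X spatial pixels
+F = np.array([(f*np.exp(-2j*np.pi*u*n/X)).sum() for u in n])
+
+# synthesis: recover every spatial pixel from all X freq-pixels
+f2 = np.array([(F*np.exp(2j*np.pi*n*x/X)).sum()/X for x in n])
+
+print(np.allclose(F, np.fft.fft(f)), np.allclose(f, f2.real))
+@end example
+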
-Because we have multiple exposures that are slightly (sub-pixel) shifted, we
can also increase the spatial resolution of the output.
-For example, let's set the output coordinate-delta (or pixel scale) to be 1/2
of the input.
-In other words, the number of pixels in each dimension of the output is double
the first Warp command of this section:
+When the input pixel row (we are still only working on 1D data) has @mymath{X}
pixels, then it is @mymath{L=XP} spatial units wide.
+@mymath{L}, or the length of the input data was defined in @ref{Fourier
series} and @mymath{P} or the space between the pixels in the input was defined
in @ref{Dirac delta and comb}.
+As we saw in @ref{Sampling theorem}, the input (spatial) pixel spacing
(@mymath{P}) specifies the range of frequencies that can be studied and in
@ref{Fourier series} we saw that the length of the (spatial) input,
(@mymath{L}) determines the resolution (or size of the freq-pixels) in our
discrete Fourier transformed image.
+Both result from the fact that the frequency domain is the inverse of the
spatial domain.
-@example
-$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
- --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/2 \
- --checkmaxfrac
+@node Fourier operations in two dimensions, Edges in the frequency domain,
Discrete Fourier transform, Frequency domain and Fourier operations
+@subsubsection Fourier operations in two dimensions
-$ aststatistics jplus-e1.fits -hMAX-FRAC --minimum --maximum
-0.06263604388 0.2506802701
+Once all the relations in the previous sections have been clearly understood
in one dimension, it is very easy to generalize them to two or even more
dimensions since each dimension is by definition independent.
+Previously we defined @mymath{l} as the continuous variable in 1D and the
inverse of the period in its direction to be @mymath{\omega}.
+Let's show the second spatial direction with @mymath{m} and the inverse of the
period in the second dimension with @mymath{\nu}.
+The Fourier transform in 2D (see @ref{Fourier transform}) can be written as:
-$ astscript-fits-view jplus-e1.fits
-@end example
+@dispmath{F(\omega, \nu)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
+f(l, m)e^{-i({\omega}l+{\nu}m)}dl\,dm}
-From the last command, you see that like the previous change in
@option{--cdelt}, the range of @code{MAX-FRAC} has decreased.
-However, when you look at the warped image and the @code{MAX-FRAC} image with
the last command, you still visually see the Moir@'e pattern in the noise
(although it has significantly decreased compared to the original resolution).
-It is still present because 2 is an exact multiple of 1.
-Let's try increasing the resolution (oversampling) by a factor of 1.25 (which
isn't an exact multiple of 1):
+@dispmath{f(l, m)={1\over 4\pi^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}
+F(\omega, \nu)e^{i({\omega}l+{\nu}m)}d\omega\,d\nu}
-@example
-$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
- --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/1.25 \
- --checkmaxfrac
-$ astscript-fits-view jplus-e1.fits
-@end example
+The 2D Dirac @mymath{\delta(l,m)} is non-zero only when @mymath{l=m=0}.
+The 2D Dirac comb (or Dirac brush! See @ref{Dirac delta and comb}) can be
written in units of the 2D Dirac @mymath{\delta}.
+For most image detectors, the sides of a pixel are equal in both dimensions,
so @mymath{P} remains unchanged; if a specific device with non-square pixels is
used, then a different value should be used for each dimension.
-You don't see any Moir@'e pattern in the noise any more, but when you look at
the @code{MAX-FRAC} extension, you see it is very different from the ones you
had seen before.
-In the previous @code{MAX-FRAC} image, you could see large blobs of similar
values.
-But here, you see that the variation is almost on a pixel scale, and the
difference between one pixel to the next is not significant.
-This is why you don't see any Moir@'e pattern in the warped image.
+@dispmath{{\rm III}_P(l, m)\equiv\displaystyle\sum_{j=-\infty}^{\infty}
+\displaystyle\sum_{k=-\infty}^{\infty}
+\delta(l-jP, m-kP) }
-In J-PLUS, each part of the sky was observed with a three-point dithering
pattern.
-Let's download the other two exposures and warp the same region of the sky to
the same pixel grid (using the @option{--gridfile} feature).
-Then, let's open all three cropped images in one DS9 instance:
+The two-dimensional sampling theorem (see @ref{Sampling theorem}) is thus very
easily derived as before, since the frequencies in each dimension are
independent.
+Let's take @mymath{\nu_m} as the maximum frequency along the second dimension.
+Therefore the two-dimensional sampling theorem says that a 2D band-limited
function can be recovered when the following conditions hold@footnote{If the
pixels are not square, then each dimension has to use the respective pixel
size, but since most detectors have square pixels, we assume so here too.}:
-@example
-$ wget $jplusdr2/get_fits?id=771465 -Ojplus-exp2.fits.fz
-$ wget $jplusdr2/get_fits?id=771467 -Ojplus-exp3.fits.fz
+@dispmath{ {2\pi\over P} > 2\omega_m \quad\quad\quad {\rm and}
+\quad\quad\quad {2\pi\over P} > 2\nu_m}
-$ astwarp jplus-exp2.fits.fz --gridfile jplus-e1.fits \
- -o jplus-e2.fits --checkmaxfrac
-$ astwarp jplus-exp3.fits.fz --gridfile jplus-e1.fits \
- -o jplus-e3.fits --checkmaxfrac
+Finally, let's represent the pixel counter on the second dimension in the
spatial and frequency domains with @mymath{y} and @mymath{v} respectively.
+Also let's assume that the input image has @mymath{Y} pixels on the second
dimension.
+Then the two dimensional discrete Fourier transform and its inverse (see
@ref{Discrete Fourier transform}) can be written as:
+
+@dispmath{F_{u,v}=\displaystyle\sum_{x=0}^{X-1}\displaystyle\sum_{y=0}^{Y-1}
+f_{x,y}e^{-i2\pi({ux\over X}+{vy\over Y})} }
+
+@dispmath{f_{x,y}={1\over
XY}\displaystyle\sum_{u=0}^{X-1}\displaystyle\sum_{v=0}^{Y-1}
+F_{u,v}e^{i2\pi({ux\over X}+{vy\over Y})} }
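+
+A minimal numerical check of the 2D analysis equation (a sketch in Python
with NumPy; the freq-pixel @mymath{(u,v)=(1,2)} is an arbitrary choice):
+
+@example
+import numpy as np
+
+rng = np.random.default_rng(0)
+img = rng.normal(size=(4, 4))      # a tiny X=Y=4 pixel "image"
+X, Y = img.shape
+x, y = np.meshgrid(np.arange(X), np.arange(Y), indexing='ij')
+
+u, v = 1, 2                        # one freq-pixel, explicitly:
+F_uv = (img*np.exp(-2j*np.pi*(u*x/X + v*y/Y))).sum()
+
+print(np.isclose(F_uv, np.fft.fft2(img)[u, v]))    # True
+@end example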
+
+
+@node Edges in the frequency domain, , Fourier operations in two dimensions,
Frequency domain and Fourier operations
+@subsubsection Edges in the frequency domain
+
+With a good grasp of the frequency domain, we can revisit the problem of
convolution on the image edges, see @ref{Edges in the spatial domain}.
+When we apply the convolution theorem (see @ref{Convolution theorem}) to
convolve an image, we first take the discrete Fourier transforms (DFT,
@ref{Discrete Fourier transform}) of both the input image and the kernel, then
we multiply them with each other and then take the inverse DFT to construct the
convolved image.
+Of course, in order to multiply them with each other in the frequency domain,
the two images have to be the same size, so let's assume that we pad the kernel
(it is usually smaller than the input image) with zero valued pixels in both
dimensions so it becomes the same size as the input image before the DFT.
+
+Having multiplied the two DFTs, we now apply the inverse DFT which is where
the problem is usually created.
+If the DFT of the kernel only had values of 1 (unrealistic condition!) then
there would be no problem and the inverse DFT of the multiplication would be
identical with the input.
+However in real situations, the kernel's DFT has a maximum of 1 (because the
sum of the kernel's pixels has to be one, see @ref{Convolution process}) and
decreases towards the higher frequencies, something like the hypothetical
profile of @ref{samplingfreq}.
+So when multiplied with the input image's DFT, the coefficients or magnitudes
(see @ref{Circles and the complex plane}) of the smallest frequency (or the sum
of the input image pixels) remains unchanged, while the magnitudes of the
higher frequencies are significantly reduced.
+
+As we saw in @ref{Sampling theorem}, the Fourier transform of a discrete input
will be infinitely repeated.
+In the final inverse DFT step, the input is in the frequency domain (the
multiplied DFT of the input image and the kernel DFT).
+So the result (our output convolved image) will be infinitely repeated in the
spatial domain.
+In order to accurately reconstruct the input image, we need all the
frequencies with the correct magnitudes.
+However, when the magnitudes of higher frequencies are decreased, longer
periods (lower frequencies) will dominate in the reconstructed pixel values.
+Therefore, when constructing a pixel on the edge of the image, the newly
empowered longer periods will look beyond the input image edges and will find
the repeated input image there.
+So if you convolve an image in this fashion using the convolution theorem,
when a bright object exists on one edge of the image, its blurred wings will be
present on the other side of the convolved image.
+This is often termed circular or cyclic convolution.
+
+So, as long as we are dealing with convolution in the frequency domain, there
is nothing we can do about the image edges themselves.
+The least we can do is to eliminate the ghosts from the other side of the
image.
+So, we add zero valued pixels to both the input image and the kernel in both
dimensions so the image that will be convolved has a size equal to the sum of
both images in each dimension.
+Of course, the effect of this zero-padding is that the sides of the output
convolved image will become dark.
+To put it another way, the edges are going to drain the flux from nearby
objects.
+But at least it is consistent across all the edges of the image and is
predictable.
+In Convolve, you can see the padded images when inspecting the frequency
domain convolution steps with the @option{--viewfreqsteps} option.
-$ astscript-fits-view jplus-e*.fits
-@end example
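+
+The wrap-around (and its removal through zero-padding) can be seen
numerically in a small sketch (in Python with NumPy; the 16-pixel ``image''
and 5-pixel kernel are hypothetical):
+
+@example
+import numpy as np
+
+img = np.zeros(16); img[0] = 1.0    # bright object on one edge
+ker = np.array([1., 2., 4., 2., 1.])/10
+
+# convolution theorem without padding: cyclic convolution
+k16 = np.zeros(16); k16[:5] = ker
+cyc = np.fft.ifft(np.fft.fft(img)*np.fft.fft(np.roll(k16, -2))).real
+print(cyc[14:16])    # wings have wrapped to the opposite edge
+
+# zero-pad both to 16+5-1=20 pixels before the DFTs
+i20 = np.zeros(20); i20[:16] = img
+k20 = np.zeros(20); k20[:5] = ker
+pad = np.fft.ifft(np.fft.fft(i20)*np.fft.fft(np.roll(k20, -2))).real
+print(pad[14:16])    # zero: the wings fall in the padding instead
+@end example
+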
+
+@node Spatial vs. Frequency domain, Convolution kernel, Frequency domain and
Fourier operations, Convolve
+@subsection Spatial vs. Frequency domain
+
+With the discussions above it might not be clear when to choose the spatial
domain and when to choose the frequency domain.
+Here we will try to list the benefits of each.
@noindent
-In the three warped images, you don't see any Moir@'e pattern, so far so
good...
-now, take the following steps:
-@enumerate
+The spatial domain,
+@itemize
@item
-Click on the ``Frame'' button (in the top row of buttons just on top of the
image), and select the ``Single'' button in the bottom row.
+Can correct for the edge effects of convolution, see @ref{Edges in the spatial
domain}.
+
@item
-Open the ``Zoom'' menu, and select ``Zoom 16''.
+Can operate on datasets that contain blank pixels.
+
@item
-In the bottom row of buttons right on top of the image, press the ``next''
button to flip through each exposure's @code{MAX-FRAC} extension.
+Can be faster than frequency domain when the kernel is small (in terms of the
number of pixels on the sides).
+@end itemize
+
+@noindent
+The frequency domain,
+@itemize
@item
-Focus your eyes on the pixels with the largest value (white colored pixels),
while pressing the ``next'' button to flip between the exposures.
-You will see that in each exposure they cover different pixels.
-@end enumerate
+Will be much faster when the image and kernel are both large.
+@end itemize
-The exercise above shows that the effect of the varying smoothing level (that
had already shrunk to a per-pixel level) will be further decreased after we
stack the images.
-So let's stack these three images with the commands below.
-First, we need to remove the sky-level from each image using
@ref{NoiseChisel}, then we'll stack the @code{INPUT-NO-SKY} extensions using
sigma-clipping (to reject outliers by @ref{Sigma clipping}, using the
@ref{Stacking operators}).
+@noindent
+As a general rule of thumb, when working on an image of modeled profiles use
the frequency domain and when working on an image of real (observed) objects
use the spatial domain (corrected for the edges).
+The reason is that if you apply a frequency domain convolution to a real
image, you are going to lose information on the edges, and generally you do
not want large kernels.
+But when you have made the profiles in the image yourself, you can just make a
larger input image and crop the central parts to completely remove the edge
effect, see @ref{If convolving afterwards}.
+Also due to oversampling, both the kernels and the images can become very
large and the speed boost of frequency domain convolution will significantly
improve the processing time, see @ref{Oversampling}.
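+For example (a sketch with hypothetical file names):
+
+@example
+## Image of modeled (mock) profiles: use the frequency domain.
+$ astconvolve mock.fits --kernel=psf.fits --domain=frequency
+
+## Image of real (observed) objects: use the spatial domain.
+$ astconvolve observed.fits --kernel=psf.fits --domain=spatial
+@end example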
-@example
-$ astnoisechisel jplus-e1.fits -ojplus-nc1.fits
-$ astnoisechisel jplus-e2.fits -ojplus-nc2.fits
-$ astnoisechisel jplus-e3.fits -ojplus-nc3.fits
+@node Convolution kernel, Invoking astconvolve, Spatial vs. Frequency domain,
Convolve
+@subsection Convolution kernel
-$ astarithmetic jplus-nc*.fits 3 5 0.2 sigclip-mean \
- -gINPUT-NO-SKY -ojplus-stack.fits
+All the programs that need convolution will need to be given a convolution
kernel file and extension.
+In most cases (other than Convolve, see @ref{Convolve}) the kernel file name
is optional.
+However, the extension is necessary and must be specified either on the
command-line or in at least one of the configuration files (see
@ref{Configuration files}).
+Within Gnuastro, there are two ways to create a kernel image:
-$ astscript-fits-view jplus-nc*.fits jplus-stack.fits
-@end example
+@itemize
+
+@item
+MakeProfiles: You can use MakeProfiles to create a parametric (based on a
radial function) kernel, see @ref{MakeProfiles}.
+By default MakeProfiles will make the Gaussian and Moffat profiles in a
separate file so you can feed it into any of the programs.
+
+@item
+ConvertType: You can write your own desired kernel into a text file table and
convert it to a FITS file with ConvertType, see @ref{ConvertType}.
+Just be careful that the kernel has to have an odd number of pixels along its
two axes, see @ref{Convolution process}.
+All the programs that do convolution will normalize the kernel internally, so
if you choose this option, you do not have to worry about normalizing the
kernel.
+Only Convolve has an option to disable normalization, see @ref{Invoking
astconvolve}.
+
+@end itemize
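+For example, a hypothetical sketch of both approaches (the numbers are only
for illustration):
+
+@example
+## A parametric (Gaussian) kernel with MakeProfiles: FWHM of 2
+## pixels, truncated at 5 times the FWHM.
+$ astmkprof --kernel=gaussian,2,5 --oversample=1
+
+## A custom kernel: write the values in a plain-text file and
+## convert it to FITS with ConvertType.
+$ astconvertt kernel.txt --output=kernel.fits
+@end example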
@noindent
-After opening the individual exposures and the final stack with the last
command, take the following steps to see the comparisons properly:
-@enumerate
+The two options to specify a kernel file name and its extension are shown
below.
+These are common between all the programs that will do convolution.
+@table @option
+@item -k FITS
+@itemx --kernel=FITS
+The convolution kernel file name.
+The @code{BITPIX} (data type) value of this file can be any standard type and
it does not necessarily have to be normalized.
+Several operations will be done on the kernel image prior to the program's
processing:
+
+@itemize
+
@item
-Click on the stack image so it is selected.
+It will be converted to floating point type.
+
@item
-Go to the ``Frame'' menu, then the ``Lock'' item, then activate ``Scale and
Limits''.
+All blank pixels (see @ref{Blank pixels}) will be set to zero.
+
@item
-Scroll your mouse or touchpad to zoom into the image.
-@end enumerate
+It will be normalized so the sum of its pixels equals unity.
-@noindent
-You clearly see that the stacked image is deeper and that there is no Moir@'e
pattern, while you have slightly @emph{improved} the spatial resolution of the
output compared to the input.
-In case you want the stack to have the original pixel resolution, you just
need one more warp:
+@item
+It will be flipped so the convolved image has the same orientation.
+This is only relevant if the kernel is not circular. See @ref{Convolution
process}.
+@end itemize
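+For example, to see if your kernel is already normalized, you can check the
sum of its pixels with Statistics before giving it to a program (a sketch
with a hypothetical file name):
+
+@example
+$ aststatistics kernel.fits --sum
+@end example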
-@example
-$ astwarp jplus-stack.fits --cdelt=$cdelt -ojplus-stack-origres.fits
-@end example
+@item -U STR
+@itemx --khdu=STR
+The convolution kernel HDU.
+Although the kernel file name is optional, the programs need a value for
@option{--khdu} before they can run, even if the default kernel is to be
used.
+So be sure to keep its value in at least one of the configuration files (see
@ref{Configuration files}).
+By default, the system configuration file has a value.
-For optimal results, the oversampling should be determined by the dithering
pattern of the observation:
-For example if you only have two dither points, you want the pixels with
maximum value in the @code{MAX-FRAC} image of one exposure to fall on those
with a minimum value in the other exposure.
-Ideally, many more dither points should be chosen when you are planning your
observation (not just for the Moir@'e pattern, but also for all the other
reasons mentioned above).
-Based on the dithering pattern, you want to select the increased resolution
such that the maximum @code{MAX-FRAC} values fall on every different pixel of
the output grid for each exposure.
+@end table
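+For example, a sketch of giving a custom kernel to NoiseChisel (the file
names are hypothetical):
+
+@example
+$ astnoisechisel image.fits --kernel=kernel.fits --khdu=1
+@end example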
-@node Invoking astwarp, , Moire pattern and its correction, Warp
-@subsection Invoking Warp
-Warp will warp an input image into a new pixel grid by pixel mixing (see
@ref{Resampling}).
-Without any options, Warp will remove any non-linear distortions from the
image and align the output pixel coordinates to its WCS coordinates.
-Any homographic warp (for example, scaling, rotation, translation, projection,
see @ref{Linear warping basics}) can also be done by calling the relevant
option explicitly.
-The general template for invoking Warp is:
+@node Invoking astconvolve, , Convolution kernel, Convolve
+@subsection Invoking Convolve
+
+Convolve an input dataset (2D image or 1D spectrum for example) with a known
kernel, or make the kernel necessary to match two PSFs.
+The general template for Convolve is:
@example
-$ astwarp [OPTIONS...] InputImage
+$ astconvolve [OPTION...] ASTRdata
@end example
@noindent
One line examples:
@example
-## Align image with celestial coordinates and remove any distortion
-$ astwarp image.fits
-
-## Align four exposures to same pixel grid and stack them with
-## Arithmetic program's sigma-clipped mean operator (out of many
-## stacking operators, see Arithmetic's documentation).
-$ grid="--center=1.234,5.678 --widthinpix=1001,1001 --cdelt=0.2/3600"
-$ astwarp a.fits $grid --output=A.fits
-$ astwarp b.fits $grid --output=B.fits
-$ astwarp c.fits $grid --output=C.fits
-$ astwarp d.fits $grid --output=D.fits
-$ astarithmetic A.fits B.fits C.fits D.fits 4 5 0.2 sigclip-mean \
- -g1 --output=stack.fits
-
-## Warp a previously created mock image to the same pixel grid as the
-## real image (including any distortions).
-$ astwarp mock.fits --gridfile=real.fits
+## Convolve mockimg.fits with psf.fits:
+$ astconvolve --kernel=psf.fits mockimg.fits
-## Rotate and then scale input image:
-$ astwarp --rotate=37.92 --scale=0.8 image.fits
+## Convolve in the spatial domain:
+$ astconvolve observedimg.fits --kernel=psf.fits --domain=spatial
-## Scale, then translate the input image:
-$ astwarp --scale 8/3 --translate 2.1 image.fits
+## Convolve a 3D cube (only spatial domain is supported in 3D).
+## It is also necessary to define 3D tiles and channels for
+## parallelization (see the Tessellation section for more).
+$ astconvolve cube.fits --kernel=kernel3d.fits --domain=spatial \
+ --tilesize=30,30,30 --numchannels=1,1,1
-## Directly input a custom warping matrix (using fraction):
-$ astwarp --matrix=1/5,0,4/10,0,1/5,4/10,0,0,1 image.fits
+## Find the kernel to match sharper and blurry PSF images (they both
+## have to have the same pixel size).
+$ astconvolve --kernel=sharperimage.fits --makekernel=10 \
+ blurryimage.fits
-## Directly input a custom warping matrix, with final numbers:
-$ astwarp --matrix="0.7071,-0.7071, 0.7071,0.7071" image.fits
+## Convolve a Spectrum (column 14 in the FITS table below) with a
+## custom kernel (the kernel will be normalized internally, so only
+## the ratios are important). Sed is used to replace the spaces with
+## new line characters so Convolve sees them as values in one column.
+$ echo "1 3 10 3 1" | sed 's/ /\n/g' | astconvolve spectra.fits -c14
@end example
-If any processing is to be done, Warp needs to be given a 2D FITS image.
-As in all Gnuastro programs, when an output is not explicitly set with the
@option{--output} option, the output filename will be set automatically based
on the operation, see @ref{Automatic output}.
-For the full list of general options to all Gnuastro programs (including
Warp), please see @ref{Common options}.
-
-Warp uses pixel mixing to derive the pixel values of the output image, see
@ref{Resampling}.
-To be the most accurate, the input image will be read as a 64-bit double
precision floating point dataset and all internal processing is done in this
format.
-Upon writing, by default it will be converted to 32-bit single precision
floating point type (actual observational data rarely have such precision!).
-In case you want a different output type, you can use the @option{--type}
option that is common to several Gnuastro programs.
-For example, if your input is a mock image without noise, and you want to
preserve the 64-bit precision, use @option{--type=float64}.
-Just note that the file size will also be double!
-For more on the precision of various types, see @ref{Numeric data types}.
-
-By default (if no linear operation is requested), Warp will align the pixel
grid of the input image to the WCS coordinates it contains.
-This operation and the options that govern it are described in @ref{Align
pixels with WCS considering distortions}.
-You can Warp an input image to the same pixel grid as a reference FITS file
using the @option{--wcsfile} option.
-In this case, the output image will take all the information needed from the
reference WCS file and HDU/extension specified with @option{--wcshdu}, thus it
will discard any other resampling options given.
+The only argument accepted by Convolve is the input file.
+Some of Convolve's options are shared with other Gnuastro programs, so to
avoid repetition they are not described here.
+For the full list of options shared by all Gnuastro programs, please see
@ref{Common options}.
+In particular, for spatial domain convolution on multi-dimensional datasets,
Convolve uses Gnuastro's tessellation to speed up the processing, see
@ref{Tessellation}.
+Common options related to tessellation are described in @ref{Processing
options}.
-If you need any custom linear warping (independent of the WCS, see @ref{Linear
warping basics}), you need to call the respective operation manually.
-These are described in @ref{Linear warps to be called explicitly}.
-Please note that you may not use both linear and non-linear modes
simultaneously.
-For example, you cannot scale or rotate the image while removing its
non-linear distortions at the same time.
+1-dimensional datasets (for example, spectra) are only read as columns within
a table (see @ref{Tables} for more on how Gnuastro programs read tables).
+Note that currently 1D convolution is only implemented in the spatial domain
and thus kernel-matching is also not supported.
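+For example, mirroring the one-line example above, the sketch below
convolves a hypothetical @code{FLUX} column with a small custom kernel:
+
+@example
+$ echo "1 2 4 2 1" | sed 's/ /\n/g' | astconvolve spectra.fits -cFLUX
+@end example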
-The following options are shared between both modes:
+Here we will only explain the options particular to Convolve.
+Run Convolve with @option{--help} in order to see the full list of options
Convolve accepts, irrespective of where they are explained in this book.
@table @option
-@item --hstartwcs=INT
-Specify the first header keyword number (line) that should be used to read the
WCS information, see the full explanation in @ref{Invoking astcrop}.
-@item --hendwcs=INT
-Specify the last header keyword number (line) that should be used to read the
WCS information, see the full explanation in @ref{Invoking astcrop}.
+@item --kernelcolumn
+Column containing the 1D kernel.
+When the input dataset is a 1-dimensional column, and the host table has more
than one column, use this option to specify which column should be used.
-@item -C FLT
-@itemx --coveredfrac=FLT
-Depending on the warp, the output pixels that cover pixels on the edge of the
input image, or blank pixels in the input image, are not going to be fully
covered by input data.
-With this option, you can specify the acceptable covered fraction of such
pixels (any value between 0 and 1).
-If you only want output pixels that are fully covered by the input image area
(and are not blank), then you can set @option{--coveredfrac=1} (which is the
default!).
-Alternatively, a value of @code{0} will keep output pixels that are even
infinitesimally covered by the input.
-As a result, with @option{--coveredfrac=0}, the sum of the pixels in the input
and output images will be exactly the same.
-@end table
+@item --nokernelflip
+Do not flip the kernel after reading; only for spatial domain convolution.
+This can be useful if the flipping has already been applied to the kernel.
+By default, the input kernel is flipped to avoid the output getting flipped;
see @ref{Convolution process}.
-@menu
-* Align pixels with WCS considering distortions:: Default operation.
-* Linear warps to be called explicitly:: Other warps.
-@end menu
+@item --nokernelnorm
+Do not normalize the kernel after reading it (normalization makes the sum of
the kernel's pixels equal unity).
+As described in @ref{Convolution process}, the kernel is normalized by default.
-@node Align pixels with WCS considering distortions, Linear warps to be called
explicitly, Invoking astwarp, Invoking astwarp
-@subsubsection Align pixels with WCS considering distortions
+@item -d STR
+@itemx --domain=STR
+@cindex Discrete Fourier transform
+The domain to use for the convolution.
+The acceptable values are `@code{spatial}' and `@code{frequency}',
corresponding to the respective domain.
-@cindex Resampling
-@cindex WCS distortion
-@cindex TPV distortion
-@cindex SIP distortion
-@cindex Non-linear distortion
-@cindex Align pixel and WCS coordinates
-When none of the linear warps@footnote{For linear warps, see @ref{Linear warps
to be called explicitly}.} are requested, Warp will align the input's pixel
axes with it's WCS axes.
-In the process, any possibly existing distortion is also removed (such as
@code{TPV} and @code{SIP}).
-Usually, the WCS axes are the Right Ascension and Declination in equatorial
coordinates.
-The output image's pixel grid is highly customizable through the options in
this section.
-To learn about Warp's strategy to build the new pixel grid, see
@ref{Resampling}.
-For strong distortions (that produce strong curvatures), you can fine-tune the
area-based resampling with @option{--edgesampling}, as described below.
+For large images, the frequency domain process will be more efficient than
convolving in the spatial domain.
+However, the edges of the image will lose some flux (see @ref{Edges in the
spatial domain}) and the image must not contain any blank pixels, see
@ref{Spatial vs. Frequency domain}.
-On the other hand, sometimes you need to Warp an input image to the exact same
grid of an already available reference FITS image with an existing WCS.
-If that image is already aligned, finding its center, number of pixels and
pixel scale can be annoying (and just increase the complexity of your script).
-On the other hand, if that image is not aligned (for example, has a certain
rotation in the sky, and has a different distortion), there are too many WCS
parameters to set (some are not yet available explicitly in the options here)!
-For such scenarios, Warp has the @option{--gridfile} option.
-When @option{--gridfile} is called, the options below that are used to define
the output's WCS will be ignored (these options: @option{--center},
@option{--widthinpix}, @option{--cdelt}, @option{--ctype}).
-In this case, the output's WCS and pixel grid will exactly match the image
given to @option{--gridfile} (including any rotation, pixel scale, or
distortion or projection).
-@cartouche
-@noindent
-@cindex Stacking
-@cindex Coaddition
-@strong{Set @option{--cdelt} explicitly when you plan to stack many warped
images:}
-To align some images and later stack them, it is necessary to be sure the
pixel sizes of all the images are the same exactly.
-Most of the time the measured (during astrometry) pixel scale of the separate
exposures, will be different in the second or third digit number after the
decimal point.
-It is a normal/statistical error in measuring the astrometry.
-On a large image, these slight differences can cause different output sizes
(of one or two pixels on a very large image).
+@item --checkfreqsteps
+With this option, a file that has the initial name of the output file,
suffixed with @file{_freqsteps.fits}, will be created; all the steps done to
arrive at the final convolved image are saved as extensions in this file.
+The extensions in order are:
-You can fix this by explicitly setting the pixel scale of each warped exposure
with Warp's @option{--cdelt} option that is described below.
-For good strategies of setting the pixel scale, see @ref{Moire pattern and its
correction}.
-@end cartouche
+@enumerate
+@item
+The padded input image.
+In frequency domain convolution, the two images (input and kernel) have to
be the same size, so both are padded with zeros.
-Another problem that may arise when aligning images to new pixel grids is the
aliasing or visible Moir@'e patterns on the output image.
-This artifact should be removed if you are stacking several exposures,
especially with a dithering pattern.
-If not see @ref{Moire pattern and its correction} for ways to mitigate the
visible patterns.
-See the description of @option{--gridfile} below for more.
+@item
+The padded kernel, similar to the above.
-@cartouche
-@noindent
-@cindex WCSLIB
-@strong{Known issue:} Warp's WCS-based aligning works best with WCSLIB version
7.12 (released in September 2022) and above.
-If you have an older version of WCSLIB, you might get a @code{wcss2p} error
otherwise.
-@end cartouche
+@item
+@cindex Phase angle
+@cindex Complex numbers
+@cindex Numbers, complex
+@cindex Fourier spectrum
+@cindex Spectrum, Fourier
+The Fourier spectrum of the forward Fourier transform of the input image.
+Note that the Fourier transform is complex (and therefore not viewable in
one image!), so we either have to show the `Fourier spectrum' or the `Phase
angle'.
+For the complex number @mymath{a+ib}, the Fourier spectrum is defined as
@mymath{\sqrt{a^2+b^2}} while the phase angle is defined as
@mymath{\arctan(b/a)}; a worked example is given after this list.
-@table @option
-@item -c FLT,FLT
-@itemx --center=FLT,FLT
-@cindex CRVALi
-@cindex Aligning an image
-WCS coordinates of the center of the central pixel of the output image.
-Since a central pixel is only defined with an odd number of pixels along both
dimensions, the output will always have an odd number of pixels.
-When @option{--center} or @option{--gridfile} aren't given, the output will
have the same central WCS coordinate as the input.
+@item
+The Fourier spectrum of the forward Fourier transform of the kernel image.
-Usually, the WCS coordinates are Right Ascension and Declination (when the
first three characters of @code{CTYPE1} and @code{CTYPE2} are respectively
@code{RA-} and @code{DEC}).
-For more on the @code{CTYPEi} keyword values, see @code{--ctype} below.
+@item
+The Fourier spectrum of the multiplied (through complex arithmetic)
transformed images.
-@item -w INT[,INT]
-@itemx --width=INT[,INT]
-Width and height of the output image in units of WCS (usually degrees).
-If you want the values to be read as pixels, also call the
@option{--widthinpix} option with @option{--width}.
-If a single value is given, Warp will use the same value for the second
dimension (creating a square output).
-When @option{--width} or @option{--gridfile} aren't given, Warp will calculate
the necessary size of the output pixel grid to fully contain the input image.
+@item
+@cindex Round-off error
+@cindex Floating point round-off error
+@cindex Error, floating point round-off
+The inverse Fourier transform of the multiplied image.
+If you open it, you will see that the convolved image is now in the center,
not on one side of the image as it started with (in the padded image of the
first extension).
+If you are working on a mock image which originally had pixels of precisely
0.0, you will notice that in the parts that your convolved profile(s) did not
cover, the values are now @mymath{\sim10^{-18}}; this is due to
floating-point round-off errors.
+Therefore in the final step (when cropping the central parts of the image),
we also remove any pixel with a value less than @mymath{10^{-17}}.
+@end enumerate
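+For example, a pixel with the complex value @mymath{3+4i} has a Fourier
spectrum of @mymath{\sqrt{3^2+4^2}=5} and a phase angle of
@mymath{\arctan(4/3)\approx53^\circ}.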
-Usually the WCS coordinates are in units of degrees (defined by the
@code{CUNITi} keywords of the FITS standard).
-But entering a certain number of arcseconds or arcminutes for the width can be
annoying (you will usually need to go to the calculator!).
-To simplify such situations, this option also accepts division.
-For example @option{--width=1/60,2/60} will make an aligned warp that is 1
arcmin along Right Ascension and 2 arcminutes along the Declination.
+@item --noedgecorrection
+Do not correct the edge effect in spatial domain convolution (this correction
is done in spatial domain convolution by default).
+For a full discussion, please see @ref{Edges in the spatial domain}.
-With the @option{--widthinpix} option the values will be interpreted as
numbers of pixels.
-In this scenario, this option should be given @emph{odd} integer(s) that are
greater than 1.
-This ensures that the output image can have a @emph{central} pixel.
-Recall that through the @option{--center} option, you specify the WCS
coordinate of the center of the central pixel.
-The central coordinate of an image with an even number of pixels will be on
the edge of two pixels, so a ``central'' pixel is not well defined.
-If any of the given values are even, Warp will automatically add a single
pixel (to make it an odd integer) and print a warning message.
+@item -m INT
+@itemx --makekernel=INT
+If this option is called, Convolve will do PSF-matching: the output will be
the kernel that you should convolve with the sharper image to obtain the blurry
one (see @ref{Convolution theorem}).
+The two images must have the same size (number of pixels).
+This option is not yet supported in 1-dimensional datasets.
+In effect, you only need to give the two PSFs of your two datasets; Convolve
will find the matching kernel from them, which you can then apply to the
higher-resolution (sharper) image.
-@item --widthinpix
-When called, the values given to the @option{--width} option will be
interpreted as the number of pixels along each dimension(s).
-See the description of @option{--width} for more.
+The image given to the @option{--kernel} option is assumed to be the sharper
(less blurry) image and the input image (with no option) is assumed to be the
more blurry image.
+The value given to this option will be used as the maximum radius of the
kernel.
+Any pixel in the final kernel that is larger than this distance from the
center will be set to zero.
-@item -x FLT[,FLT]
-@itemx --cdelt=FLT[,FLT]
-@cindex CDELTi
-@cindex Pixel scale
-Coordinate deltas or increments (@code{CDELTi} in the FITS standard), or the
pixel scale in both dimensions.
-If a single value is given, it will be used for both axes.
-In this way, the output's pixels will be squares on the sky at the reference
point (as is usually expected!).
-When @option{--cdelt} or @option{--gridfile} aren't given, Warp will read the
input's pixel scale and choose the larger of @code{CDELT1} or @code{CDELT2} so
the output pixels are square.
+Noise is generally most significant in the higher frequencies, which can
make the higher frequencies of the final result less reliable.
+So all the frequencies in the sharper input image which have a spectrum
smaller than the value given to the @option{--minsharpspec} option are set to
zero and not divided.
+This will cause the wings of the final kernel to be flatter than they would
ideally be, which will make the convolved image unreliable if the given value
is too high.
-Usually (when dealing with RA and Dec, and the @code{CUNITi}s have a value of
@code{deg}), the units of the given values are degrees/pixel.
-Warp allows you to easily convert from @emph{arcsec} to @emph{degrees} by
simply appending a @code{/3600} to the value.
-For example, for an output image of pixel scale @code{0.27} arcsec/pixel, you
can use @code{--cdelt=0.27/3600}.
+Some notes on how to prepare your two input PSFs are given below (a
hypothetical sketch of these steps comes after the list).
+Note that these (and several other issues that relate to an accurate
estimation of the PSF) are described practically in the following tutorial:
@ref{Building the extended PSF}.
+@itemize
+@item
+Choose a bright (unsaturated) star and use a region box (with Crop for
example, see @ref{Crop}) that is sufficiently above the noise.
-@item --ctype=STR,STR
-@cindex Align
-@cindex CTYPEi
-@cindex Resampling
-The coordinate types of the output (@code{CTYPE1} and @code{CTYPE2} keywords
in the FITS standard), separated by a comma.
-By default the value to this option is `@code{RA---TAN,DEC--TAN}'.
-However, if @option{--gridfile} is given, this option is ignored.
+@item
+Mask all background sources that may be nearby (you can use Segment's clumps,
see @ref{Segment}).
-If you don't call @option{--ctype} or @option{--gridfile}, the output WCS
coordinates will be Right Ascension and Declination, while the output's
projection will be
@url{https://en.wikipedia.org/wiki/Gnomonic_projection,Gnomonic}, also known as
Tangential (TAN).
-This combination is the most common in extra-galactic imaging surveys.
-For other coordinates and projections in your output use other values, as
described below.
+@item
+Use Warp (see @ref{Warp}) to warp the pixel grid so the star's center is
exactly on the center of the central pixel in the cropped image.
+This will certainly degrade the result slightly; however, it is necessary.
+If there are multiple good stars, you can shift all of them, then normalize
them (so the sum of each star's pixels is one) and then take their average to
decrease this effect.
-According to the FITS standard version 4.0@footnote{FITS standard version 4.0:
@url{https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf}}:
@code{CTYPEi} is the
-``type for the Intermediate-coordinate Axis @mymath{i}.
-Any coordinate type that is not covered by this Standard or an officially
recognized FITS convention shall be taken to be linear.
-All non-linear coordinate system names must be expressed in `4–3' form: the
first four characters specify the coordinate type, the fifth character is a
hyphen (@code{-}), and the remaining three characters specify an algorithm code
for computing the world coordinate value.
-Coordinate types with names of fewer than four characters are padded on the
right with hyphens, and algorithm codes with fewer than three characters are
padded on the right with SPACE.
-Algorithm codes should be three characters''
-(see list of algorithm codes below).
+@item
+The shifting might move the center of the star by one pixel in any direction,
so crop the central pixel of the warped image to have a clean image for the
de-convolution.
+@end itemize
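+Putting these steps together, here is a hypothetical sketch (the file names,
coordinates and sizes are only for illustration; the masking and sub-pixel
warping steps are omitted for brevity):
+
+@example
+## Crop the same bright star from both images (201x201 pixels,
+## centered on the star in image coordinates):
+$ astcrop sharp.fits  --mode=img --center=1024,512 --width=201 \
+          --output=sharp-psf.fits
+$ astcrop blurry.fits --mode=img --center=1024,512 --width=201 \
+          --output=blurry-psf.fits
+
+## Find the matching kernel (maximum radius of 10 pixels):
+$ astconvolve blurry-psf.fits --kernel=sharp-psf.fits \
+              --makekernel=10 --output=matched.fits
+
+## Convolve the sharper image with it to match the blurry one:
+$ astconvolve sharp.fits --kernel=matched.fits
+@end example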
-@cindex WCS Projections
-@cindex Projections (world coordinate system)
-You can use any of the projection algorithms (last three characters of each
coordinate's type) provided by your host WCSLIB (a mandatory dependency of
Gnuastro; see @ref{WCSLIB}).
-For a very elaborate and complete description of projection algorithms in the
FITS WCS standard, see @url{https://doi.org/10.1051/0004-6361:20021327,
Calabretta and Greisen 2002}.
-Wikipedia also has a nice article on
@url{https://en.wikipedia.org/wiki/Map_projection, Map projections}.
-As an example, WCSLIB 7.12 (released in September 2022) has the following
projection algorithms:
-@table @code
-@item AZP
-@cindex Zenithal/azimuthal projection
-Zenithal/azimuthal perspective
-@item SZP
-@cindex Slant zenithal projection
-Slant zenithal perspective
-@item TAN
-@cindex Gnomonic (tangential) projection
-Gnomonic (tangential)
-@item STG
-@cindex Stereographic projection
-Stereographic
-@item SIN
-@cindex Orthographic/synthesis projection
-Orthographic/synthesis
-@item ARC
-@cindex Zenithal/azimuthal equidistant projection
-Zenithal/azimuthal equidistant
-@item ZPN
-@cindex Zenithal/azimuthal polynomial projection
-Zenithal/azimuthal polynomial
-@item ZEA
-@cindex Zenithal/azimuthal equal area projection
-Zenithal/azimuthal equal area
-@item AIR
-@cindex Airy projection
-Airy
-@item CYP
-@cindex Cylindrical perspective projection
-Cylindrical perspective
-@item CEA
-@cindex Cylindrical equal area projection
-Cylindrical equal area
-@item CAR
-@cindex Plate carree projection
-Plate carree
-@item MER
-@cindex Mercator projection
-Mercator
-@item SFL
-@cindex Sanson-Flamsteed projection
-Sanson-Flamsteed
-@item PAR
-@cindex Parabolic projection
-Parabolic
-@item MOL
-@cindex Mollweide projection
-Mollweide
-@item AIT
-@cindex Hammer-Aitoff projection
-Hammer-Aitoff
-@item COP
-@cindex Conic perspective projection
-Conic perspective
-@item COE
-@cindex Conic equal area projection
-Conic equal area
-@item COD
-@cindex Conic equidistant projection
-Conic equidistant
-@item COO
-@cindex Conic orthomorphic projection
-Conic orthomorphic
-@item BON
-@cindex Bonne projection
-Bonne
-@item PCO
-@cindex Polyconic projection
-Polyconic
-@item TSC
-@cindex Tangential spherical cube projection
-Tangential spherical cube
-@item CSC
-@cindex COBE spherical cube projection
-COBE spherical cube
-@item QSC
-@cindex Quadrilateralized spherical cube projection
-Quadrilateralized spherical cube
-@item HPX
-@cindex HEALPix projection
-HEALPix
-@item XPH
-@cindex Butterfly projection
-@cindex HEALPix polar projection
-HEALPix polar, aka "butterfly"
+@item -c FLT
+@itemx --minsharpspec=FLT
+The minimum frequency spectrum (or coefficient, or pixel value in the
frequency domain image) to use in deconvolution; see the explanations under
the @option{--makekernel} option for more information.
@end table
-@item -G
-@itemx --gridfile
-FITS filename containing the final pixel grid and WCS for the output image.
-The HDU/extension containing the grid should be specified with
@option{--gridhdu} or its short option @option{-H}.
-The HDU should contain a WCS, otherwise, Warp will abort with a crash.
-When this option is used, Warp will read the respective WCS and the size of
the image to resample the input.
-Since this WCS of this HDU contains everything needed to construct the WCS the
options above will be ignored when @option{--gridfile} is called:
@option{--cdelt}, @option{--center}, and @option{--widthinpix}.
-In the example below, let's use this option to put the image of M51 in one
survey (J-PLUS) into the pixel grid of another survey (SDSS) containing M51.
-The J-PLUS field of view is very large (almost @mymath{1.5\times1.5}
deg@mymath{^2}, in @mymath{9500\times9500} pixels), while the field of view of
SDSS in each filter is small (almost @mymath{0.3\times0.25} deg@mymath{^2} in
@mymath{2048\times1489} pixels).
-With the first two commands, we'll first download the two images, then we'll
extract the portion of the J-PLUS image that overlaps with the SDSS image and
align it exactly to SDSS's pixel grid.
-Note that these are the two images that were used in two of Gnuastro's
tutorials: @ref{Building the extended PSF} and @ref{Detecting large extended
targets}.
-@example
-## Download the J-PLUS DR2 image of M51 in the r filter.
-$ jplusbase="http://archive.cefca.es/catalogues/vo/siap"
-$ wget $jplusbase/jplus-dr2/get_fits?id=67510 \
- -O jplus.fits.fz
-## Download the SDSS image in r filter and decompress it
-## (Bzip2 is not a standard FITS compression algorithm).
-$ sdssbase=https://dr12.sdss.org/sas/dr12/boss/photoObj/frames
-$ wget $sdssbase/301/3716/6/frame-r-003716-6-0117.fits.bz2 \
- -O sdss.fits.bz2
-$ bunzip2 sdss.fits.bz2
-## Warp and crop the J-PLUS image so the output exactly
-## matches the SDSS pixel gid.
-$ astwarp jplus.fits.fz --gridfile=sdss.fits --gridhdu=0 \
- --output=jplus-on-sdss.fits
-## View the two images side-by-side:
-$ astscript-fits-view sdss.fits jplus-on-sdss.fits
-@end example
-As the example above shows, this option can therefore be very useful when
comparing images from multiple surveys.
-But there are other very interesting use cases also.
-For example, when you are making a mock dataset and need to add distortion to
the image so it matches the distortion of your camera.
-Through @option{--gridhdu}, you can easily insert that distortion over the
mock image and put the mock image in the pixel grid of an exposure.
-@item -H
-@itemx --gridhdu
-The HDU/extension of the reference WCS file specified with option
@option{--wcsfile} or its short version @option{-H} (see the description of
@option{--wcsfile} for more).
-@item --edgesampling=INT
-Number of extra samplings along the edge of a pixel.
-By default the value is @code{0} (the output pixel's polygon over the input
will be a quadrilateral (a polygon with four edges/vertices).
-Warp uses pixel mixing to derive the output pixel values.
-For a complete introduction, see @ref{Resampling}, and in particular its later
part on distortions.
-To account for this possible curvature due to distortion, you can use this
option.
-For example, @option{--edgesampling=1} will add one extra vertice in the
middle of each edge of the output pixel, producing an 8-vertice polygon.
-Similarly, @option{--edgesampling=5} will put 5 extra vertices along each
edge, thus sampling the shape (and possible curvature) of the output pixel over
an input pixel with @mymath{4+5\times4=24} vertice polygon.
-Since the polygon clipping will happen for every output pixel, a higher value
to this option can significantly reduce the running speed and increase the RAM
usage of Warp; so use it with caution: in most cases the default
@option{--edgesampling=0} is sufficient.
-To visually inspect the curvature effect on pixel area of the input image, see
option @option{--pixelareaonwcs} in @ref{Pixel information images}.
+@node Warp, , Convolve, Data manipulation
+@section Warp
+Image warping is the process of mapping the pixels of one image onto a new
pixel grid.
+This process is sometimes known as transformation; however, following the
discussion of Heckbert 1989@footnote{Paul S. Heckbert. 1989. @emph{Fundamentals
of Texture mapping and Image Warping}, Master's thesis at University of
California, Berkeley.} we will not use that term because it can be confused
with transformations of only the pixel values (flux).
+Here we specifically mean the pixel grid transformation which is better
conveyed with `warp'.
-@item --checkmaxfrac
-Check each output pixel's maximum coverage on the input data and append as the
`@code{MAX-FRAC}' HDU/extension to the output aligned image.
-This option provides an easy visual inspection for possible recurring patterns
or fringes caused by aligning to a new pixel grid.
-For more detail about the origin of these patterns and how to mitigate them
see @ref{Moire pattern and its correction}.
+@cindex Gravitational lensing
+Image warping is a very important step in astronomy, both in observational
data analysis and in simulating modeled images.
+In modeling, warping an image is necessary when we want to apply grid
transformations to the initial models, for example, in simulating gravitational
lensing.
+Observational reasons for warping an image are listed below:
-Note that the `@code{MAX-FRAC}' HDU/extension is not showing the patterns
themselves;
-It represents the largest area coverage on the input data for that particular
pixel.
-The values can be in the range between 0 to 1, where 1 means the pixel is
covering at least one complete pixel of the input data.
-On the other hand, 0 means that the pixel is not covering any pixels of the
input at all.
-@end table
+@itemize
+@cindex Signal to noise ratio
+@item
+@strong{Noise:} Most scientifically interesting targets are inherently faint
(have a very low Signal to noise ratio).
+Therefore one short exposure is not enough to detect such objects that are
drowned deeply in the noise.
+We need multiple exposures so we can add them together and increase the
objects' signal to noise ratio.
+Keeping the telescope fixed on one field of the sky is practically impossible.
+Therefore, the exposures of very deep observations have to be put onto the
same grid before they are added.
+@cindex Mosaicing
+@cindex Image mosaic
+@item
+@strong{Resolution:} If we have multiple images of one patch of the sky
(hopefully at multiple orientations) we can warp them to the same grid.
+The multiple orientations will allow us to `guess' the values of pixels on an
output pixel grid that has smaller pixel sizes and thus increase the resolution
of the output.
+This process of merging multiple observations is known as Mosaicing.
+@cindex Cosmic rays
+@item
+@strong{Cosmic rays:} Cosmic rays can randomly fall on any part of an image.
+If they collide vertically with the camera, they are going to create a very
sharp and bright spot that in most cases can be separated easily@footnote{All
astronomical targets are blurred with the PSF, see @ref{PSF}, however a cosmic
ray is not and so it is very sharp (it suddenly stops at one pixel).}.
+However, depending on the depth of the camera pixels and the angle at which a
cosmic ray collides with them, it can cover a larger, line-like area on the
CCD, which makes detection through its sharp edges very hard and error-prone.
+One of the best methods to remove cosmic rays is to compare multiple images of
the same field.
+To do that, we need all the images to be on the same pixel grid.
-@node Linear warps to be called explicitly, , Align pixels with WCS
considering distortions, Invoking astwarp
-@subsubsection Linear warps to be called explicitly
+@cindex Optical distortion
+@cindex Distortion, optical
+@item
+@strong{Optical distortion:} In wide field images, the optical distortion that
occurs on the outer parts of the focal plane will make accurate comparison of
the objects at various locations impossible.
+It is therefore necessary to warp the image and correct for those distortions
prior to the analysis.
-Linear warps include operations like rotation, scaling, sheer, etc.
-For an introduction, see @ref{Linear warping basics}.
-These are warps that don't depend on the WCS of the image and should be
explicitly requested.
-To align the input pixel coordinates with the WCS coordinates, see @ref{Align
pixels with WCS considering distortions}.
+@cindex ACS
+@cindex CCD
+@cindex WFC3
+@cindex Wide Field Camera 3
+@cindex Charge-coupled device
+@cindex Advanced camera for surveys
+@cindex Hubble Space Telescope (HST)
+@item
+@strong{Detector not on focal plane:} In some cases (like the Hubble Space
Telescope ACS and WFC3 cameras), the CCD might be tilted compared to the focal
plane, therefore the recorded CCD pixels have to be projected onto the focal
plane before further analysis.
-While they will correct any existing WCS based on the warp, they can also
operate on images without any WCS.
-For example, you have a mock image that doesn't (yet!) have its mock WCS, and
it has been created on an over-sampled grid and convolved with an over-sampled
PSF.
-In this scenario, you can use the @option{--scale} option to under-sample it
to your desired resolution.
-This is similar to the @ref{Sufi simulates a detection} tutorial.
+@end itemize
-Linear warps must be specified as command-line options, either as (possibly
multiple) modular warpings (for example, @option{--rotate}, or
@option{--scale}), or directly as a single raw matrix (with @option{--matrix}).
-If specified together, the latter (direct matrix) will take precedence and all
the modular warpings will be ignored.
-Any number of modular warpings can be specified on the command-line and
configuration files.
-If more than one modular warping is given, all will be merged to create one
warping matrix.
-As described in @ref{Merging multiple warpings}, matrix multiplication is not
commutative, so the order of specifying the modular warpings on the
command-line, and/or configuration files makes a difference (see
@ref{Configuration file precedence}).
-The full list of modular warpings and the other options particular to Warp are
described below.
+@menu
+* Linear warping basics:: Basics of coordinate transformation.
+* Merging multiple warpings:: How to merge multiple matrices.
+* Resampling:: Warping an image is re-sampling it.
+* Moire pattern and its correction:: Spatial resonance of the grid pattern on
output.
+* Invoking astwarp:: Arguments and options for Warp.
+@end menu
-The values to the warping options (modular warpings as well as
@option{--matrix}), are a sequence of at least one number.
-Each number in this sequence is separated from the next by a comma (@key{,}).
-Each number can also be written as a single fraction (with a forward-slash
@key{/} between the numerator and denominator).
-Space and Tab characters are permitted between any two numbers, just do not
forget to quote the whole value.
-Otherwise, the value will not be fully passed onto the option.
-See the examples above as a demonstration.
+@node Linear warping basics, Merging multiple warpings, Warp, Warp
+@subsection Linear warping basics
-@cindex FITS standard
-Based on the FITS standard, integer values are assigned to the center of a
pixel and the coordinate [1.0, 1.0] is the center of the first pixel (bottom
left of the image when viewed in SAO DS9).
-So the coordinate center [0.0, 0.0] is half a pixel away (in each axis) from
the bottom left vertex of the first pixel.
-The resampling that is done in Warp (see @ref{Resampling}) is done on the
coordinate axes and thus directly depends on the coordinate center.
-In some situations this is fine; for example, when rotating/aligning a real
image, all the edge pixels will be similarly affected.
-But in other situations (for example, when scaling an over-sampled mock image
to its intended resolution), this is not desired: you want the center of the
coordinates to be on the corner of the pixel.
-In such cases, you can use the @option{--centeroncorner} option which will
shift the center by @mymath{0.5} before the main warp, then shift it back by
@mymath{-0.5} after the main warp.
+@cindex Scaling
+@cindex Coordinate transformation
+Let's take @mymath{\left[\matrix{u&v}\right]} as the coordinates of a point in
the input image and @mymath{\left[\matrix{x&y}\right]} as the coordinates of
that same point in the output image@footnote{These can be any real number, we
are not necessarily talking about integer pixels here.}.
+The simplest form of coordinate transformation (or warping) is the scaling of
the coordinates.
+Let's assume we want to scale the first axis by @mymath{M} and the second by
@mymath{N}; the output coordinates of that point can then be calculated by
-@table @option
+@dispmath{\left[\matrix{x\cr y}\right]=
+ \left[\matrix{Mu\cr Nv}\right]=
+ \left[\matrix{M&0\cr0&N}\right]\left[\matrix{u\cr v}\right]}
-@item -r FLT
-@itemx --rotate=FLT
-Rotate the input image by the given angle in degrees: @mymath{\theta} in
@ref{Linear warping basics}.
-Note that commonly, the WCS structure of the image is set such that the RA is
the inverse of the image horizontal axis which increases towards the right in
the FITS standard and as viewed by SAO DS9.
-So the default center for rotation is on the right of the image.
-If you want to rotate about other points, you have to translate the warping
center first (with @option{--translate}) then apply your rotation and then
return the center back to the original position (with another call to
@option{--translate}, see @ref{Merging multiple warpings}.
+@cindex Matrix
+@cindex Multiplication, Matrix
+@cindex Rotation of coordinates
+@noindent
+Note that these are matrix multiplications.
+We thus see that we can represent any such grid warping as a matrix.
+Another thing we can do with this @mymath{2\times2} matrix is to rotate the
output coordinate around the common center of both coordinates.
+If the output is rotated anticlockwise by @mymath{\theta} degrees from the
positive (to the right) horizontal axis, then the warping matrix should become:
-@item -s FLT[,FLT]
-@itemx --scale=FLT[,FLT]
-Scale the input image by the given factor(s): @mymath{M} and @mymath{N} in
@ref{Linear warping basics}.
-If only one value is given, then both image axes will be scaled with the given
value.
-When two values are given (separated by a comma), the first will be used to
scale the first axis and the second will be used for the second axis.
-If you only need to scale one axis, use @option{1} for the axis you do not
need to scale.
-The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
+@dispmath{\left[\matrix{x\cr y}\right]=
+ \left[\matrix{ucos\theta-vsin\theta\cr usin\theta+vcos\theta}\right]=
+ \left[\matrix{cos\theta&-sin\theta\cr sin\theta&cos\theta}\right]
+ \left[\matrix{u\cr v}\right]
+ }
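+For example, with @mymath{\theta=90^\circ} (so @mymath{cos\theta=0} and
@mymath{sin\theta=1}), the point @mymath{[\matrix{1&0}]} is taken to
@mymath{[\matrix{0&1}]}:
+
+@dispmath{\left[\matrix{0&-1\cr 1&0}\right]
+          \left[\matrix{1\cr 0}\right]=
+          \left[\matrix{0\cr 1}\right]}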
-@item -f FLT[,FLT]
-@itemx --flip=FLT[,FLT]
-Flip the input image around the given axis(s).
-If only one value is given, then both image axes are flipped.
-When two values are given (separated by acomma), you can choose which axis to
flip over.
-@option{--flip} only takes values @code{0} (for no flip), or @code{1} (for a
flip).
-Hence, if you want to flip by the second axis only, use @option{--flip=0,1}.
+@cindex Flip coordinates
+@noindent
+We can also flip the coordinates around the first axis, the second axis and
the coordinate center with the following three matrices respectively:
-@item -e FLT[,FLT]
-@itemx --shear=FLT[,FLT]
-Shear the input image by the given value(s): @mymath{A} and @mymath{B} in
@ref{Linear warping basics}.
-If only one value is given, then both image axes will be sheared with the
given value.
-When two values are given (separated by a comma), the first will be used to
shear the first axis and the second will be used for the second axis.
-If you only need to shear along one axis, use @option{0} for the axis that
must be untouched.
-The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
+@dispmath{\left[\matrix{1&0\cr0&-1}\right]\quad\quad
+ \left[\matrix{-1&0\cr0&1}\right]\quad\quad
+ \left[\matrix{-1&0\cr0&-1}\right]}
-@item -t FLT[,FLT]
-@itemx --translate=FLT[,FLT]
-Translate (move the center of coordinates) the input image by the given
value(s): @mymath{c} and @mymath{f} in @ref{Linear warping basics}.
-If only one value is given, then both image axes will be translated by the
given value.
-When two values are given (separated by a comma), the first will be used to
translate the first axis and the second will be used for the second axis.
-If you only need to translate along one axis, use @option{0} for the axis that
must be untouched.
-The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
+@cindex Shear
+@noindent
+The final thing we can do with this definition of a @mymath{2\times2} warping
matrix is shear.
+If we want the output to be sheared along the first axis with @mymath{A} and
along the second with @mymath{B}, then we can use the matrix:
-@item -p FLT[,FLT]
-@itemx --project=FLT[,FLT]
-Apply a projection to the input image by the given values(s): @mymath{g} and
@mymath{h} in @ref{Linear warping basics}.
-If only one value is given, then projection will apply to both axes with the
given value.
-When two values are given (separated by a comma), the first will be used to
project the first axis and the second will be used for the second axis.
-If you only need to project along one axis, use @option{0} for the axis that
must be untouched.
-The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
+@dispmath{\left[\matrix{1&A\cr B&1}\right]}
-@item -m STR
-@itemx --matrix=STR
-The warp/transformation matrix.
-All the elements in this matrix must be separated by commas(@key{,})
characters and as described above, you can also use fractions (a forward-slash
between two numbers).
-The transformation matrix can be either a 2 by 2 (4 numbers), or a 3 by 3 (9
numbers) array.
-In the former case (if a 2 by 2 matrix is given), then it is put into a 3 by 3
matrix (see @ref{Linear warping basics}).
+@noindent
+To have one matrix representing any combination of these steps, you use matrix
multiplication, see @ref{Merging multiple warpings}.
+So any combination of these transformations can be represented with one
@mymath{2\times2} matrix:
-@cindex NaN
-The determinant of the matrix has to be non-zero and it must not contain any
non-number values (for example, infinities or NaNs).
-The elements of the matrix have to be written row by row.
-So for the general Homography matrix of @ref{Linear warping basics}, it should
be called with @command{--matrix=a,b,c,d,e,f,g,h,1}.
+@dispmath{\left[\matrix{a&b\cr c&d}\right]}
-The raw matrix takes precedence over all the modular warping options listed
above, so if it is called with any number of modular warps, the latter are
ignored.
+@cindex Wide Field Camera 3
+@cindex Advanced Camera for Surveys
+@cindex Hubble Space Telescope (HST)
+The transformations above can cover a lot of the needs of most coordinate
transformations.
+However they are limited to mapping the point @mymath{[\matrix{0&0}]} to
@mymath{[\matrix{0&0}]}.
+Therefore they are useless if you want one coordinate to be shifted compared
to the other one.
+They are also space invariant, meaning that all the coordinates in the image
will receive the same transformation.
+In other words, all the pixels in the output image will have the same area if
placed over the input image.
+So transformations which require varying output pixel sizes like projections
cannot be applied through this @mymath{2\times2} matrix either (for example,
for the tilted ACS and WFC3 camera detectors on board the Hubble space
telescope).
-@item --centeroncorner
-Put the center of coordinates on the corner of the first (bottom-left when
viewed in SAO DS9) pixel.
-This option is applied after the final warping matrix has been finalized:
either through modular warpings or the raw matrix.
-See the explanation above for coordinates in the FITS standard to better
understand this option and when it should be used.
+@cindex M@"obius, August. F.
+@cindex Homogeneous coordinates
+@cindex Coordinates, homogeneous
+To add these further capabilities, namely translation and projection, we use
the homogeneous coordinates.
+They were defined about 200 years ago by August Ferdinand M@"obius (1790 --
1868).
+For simplicity, we will only discuss points on a 2D plane and avoid the
complexities of higher dimensions.
+We cannot provide a deep mathematical introduction here; interested readers
can get a more detailed explanation from
Wikipedia@footnote{@url{http://en.wikipedia.org/wiki/Homogeneous_coordinates}}
and the references therein.
-@item -k
-@itemx --keepwcs
-@cindex WCSLIB
-@cindex World Coordinate System
-Do not correct the WCS information of the input image and save it untouched to
the output image.
-By default the WCS (World Coordinate System) information of the input image is
going to be corrected in the output image so the objects in the image are at
the same WCS coordinates.
-But in some cases it might be useful to keep it unchanged (for example, to
correct alignments).
-@end table
+By adding an extra coordinate to a point we can add the flexibility we need.
+The point @mymath{[\matrix{x&y}]} can be represented as
@mymath{[\matrix{xZ&yZ&Z}]} in homogeneous coordinates.
+Therefore multiplying all the coordinates of a point in the homogeneous
coordinates with a constant will give the same point.
+Put another way, the point @mymath{[\matrix{x&y&Z}]} corresponds to the point
@mymath{[\matrix{x/Z&y/Z}]} on the constant @mymath{Z} plane.
+Setting @mymath{Z=1}, we get the input image plane, so
@mymath{[\matrix{u&v&1}]} corresponds to @mymath{[\matrix{u&v}]}.
+With this definition, the transformations above can be generally written as:
+@dispmath{\left[\matrix{x\cr y\cr 1}\right]=
+ \left[\matrix{a&b&0\cr c&d&0\cr 0&0&1}\right]
+ \left[\matrix{u\cr v\cr 1}\right]}
+@noindent
+@cindex Affine Transformation
+@cindex Transformation, affine
+We thus acquired 4 extra degrees of freedom.
+By giving non-zero values to the zero valued elements of the last column we
can have translation (try the matrix multiplication!).
+In general, any coordinate transformation that is represented by the matrix
below is known as an affine
transformation@footnote{@url{http://en.wikipedia.org/wiki/Affine_transformation}}:
+@dispmath{\left[\matrix{a&b&c\cr d&e&f\cr 0&0&1}\right]}
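+For example, setting @mymath{c=5} and @mymath{f=10} in the matrix above
translates every point by 5 along the first axis and by 10 along the second:
+
+@dispmath{\left[\matrix{1&0&5\cr 0&1&10\cr 0&0&1}\right]
+          \left[\matrix{u\cr v\cr 1}\right]=
+          \left[\matrix{u+5\cr v+10\cr 1}\right]}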
+@cindex Homography
+@cindex Projective transformation
+@cindex Transformation, projective
+We can now consider translation, but the affine transform is still spatially
invariant.
+Giving non-zero values to the other two elements in the matrix above gives us
the projective transformation or
Homography@footnote{@url{http://en.wikipedia.org/wiki/Homography}} which is the
most general type of transformation with the @mymath{3\times3} matrix:
+@dispmath{\left[\matrix{x'\cr y'\cr w}\right]=
+ \left[\matrix{a&b&c\cr d&e&f\cr g&h&1}\right]
+ \left[\matrix{u\cr v\cr 1}\right]}
+@noindent
+So the output coordinates can be calculated from:
+@dispmath{x={x' \over w}={au+bv+c \over gu+hv+1}\quad\quad\quad\quad
+ y={y' \over w}={du+ev+f \over gu+hv+1}}
+Thus with Homography we can change the sizes of the output pixels on the input
plane, giving a `perspective'-like visual impression.
+This can be quantitatively seen in the two equations above.
+When @mymath{g=h=0}, the denominator is independent of @mymath{u} or
@mymath{v} and thus we have spatial invariance.
+Homography preserves lines at all orientations.
+A very useful fact about Homography is that its inverse is also a Homography.
+These two properties play a very important role in the implementation of this
transformation.
+A short but instructive and illustrated review of affine, projective and also
bi-linear mappings is provided in Heckbert 1989@footnote{
+Paul S. Heckbert. 1989. @emph{Fundamentals of Texture mapping and Image
Warping}, Master's thesis at University of California, Berkeley.
+Note that since points are defined as row vectors there, the matrix is the
transpose of the one discussed here.}.
+@node Merging multiple warpings, Resampling, Linear warping basics, Warp
+@subsection Merging multiple warpings
+@cindex Commutative property
+@cindex Matrix multiplication
+@cindex Multiplication, matrix
+@cindex Non-commutative operations
+@cindex Operations, non-commutative
+In @ref{Linear warping basics} we saw how a basic warp/transformation can be
represented with a matrix.
+To make more complex warpings (for example, to define a translation, rotation
and scale as one warp) the individual matrices have to be multiplied through
matrix multiplication.
+However, matrix multiplication is not commutative, so the order of the
matrices in the multiplication is very important.
+Since the matrices here operate on column vectors from the left, the first
warping should be placed as the right-most matrix, the second warping to the
left of it, and so on.
+In other words, each transformation operates on the already-warped
coordinates of the ones to its right.
+As an example for merging a few transforms into one matrix, the multiplication
below represents the rotation of an image about a point @mymath{[\matrix{U&V}]}
anticlockwise from the horizontal axis by an angle of @mymath{\theta}.
+To do this, first we take the origin to @mymath{[\matrix{U&V}]} through
translation.
+Then we rotate the image, then we translate it back to where it was initially.
+These three operations can be merged in one operation by calculating the
matrix multiplication below:
+@dispmath{\left[\matrix{1&0&U\cr0&1&V\cr{}0&0&1}\right]
+          \left[\matrix{\cos\theta&-\sin\theta&0\cr \sin\theta&\cos\theta&0\cr 0&0&1}\right]
+          \left[\matrix{1&0&-U\cr0&1&-V\cr{}0&0&1}\right]}
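+As a concrete sketch, take @mymath{\theta=90^\circ} and
+@mymath{[\matrix{U&V}]=[\matrix{100&100}]} (both are arbitrary values,
+chosen only for this demonstration).
+Since @mymath{\cos\theta=0} and @mymath{\sin\theta=1}, the three matrices
+above multiply into:
+@dispmath{\left[\matrix{0&-1&200\cr 1&0&0\cr 0&0&1}\right]}
+This merged matrix can then be given to Warp in a single call with the
+@option{--matrix} option (see @ref{Invoking astwarp}):
+
+@example
+## Rotate 'image.fits' by 90 degrees around pixel (100,100); the
+## nine values are the rows of the merged matrix above.
+$ astwarp --matrix=0,-1,200,1,0,0,0,0,1 image.fits
+@end example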
+@node Resampling, Moire pattern and its correction, Merging multiple warpings,
Warp
+@subsection Resampling
+@cindex Pixel
+@cindex Camera
+@cindex Detector
+@cindex Sampling
+@cindex Resampling
+@cindex Pixel mixing
+@cindex Photoelectrons
+@cindex Picture element
+@cindex Mixing pixel values
+A digital image is composed of discrete `picture elements' or `pixels'.
+When a real image is created from a camera or detector, each pixel's area is
used to store the number of photo-electrons that were created when incident
photons collided with that pixel's surface area.
+This process is called the `sampling' of continuous (or analog) data into digital data.
+When we change the pixel grid of an image, or ``warp'' it, we have to
calculate the flux value of each pixel on the new grid based on the old grid,
or resample it.
+Because this is a calculation (as opposed to an observation), any form of warping is going to degrade the image and mix the original pixel values with each other.
+So if an analysis can be done on the unwarped image, it is best to leave the image untouched and pursue the analysis that way.
+However as discussed in @ref{Warp} this is not possible in some scenarios and
re-sampling is necessary.
-@node Data analysis, Data modeling, Data manipulation, Top
-@chapter Data analysis
+@cindex Point pixels
+@cindex Interpolation
+@cindex Sampling theorem
+@cindex Bicubic interpolation
+@cindex Signal to noise ratio
+@cindex Bi-linear interpolation
+@cindex Interpolation, bicubic
+@cindex Interpolation, bi-linear
+When the FWHM of the camera's PSF is much larger than the pixel scale (see @ref{Sampling theorem}), we are sampling the signal at a much higher resolution than the optics can deliver.
+This is usually the case in many applications of image processing (non-astronomical imaging).
+In such cases, we can consider each pixel to be a point and not an area: the PSF does not vary much over a single pixel.
-Astronomical datasets (images or tables) contain very valuable information,
the tools in this section can help in analyzing, extracting, and quantifying
that information.
-For example, getting general or specific statistics of the dataset (with
@ref{Statistics}), detecting signal within a noisy dataset (with
@ref{NoiseChisel}), or creating a catalog from an input dataset (with
@ref{MakeCatalog}).
+Approximating a pixel's area by a point can significantly speed up the resampling and also simplify the code.
+Resampling then becomes a problem of interpolation: the points of the input grid need to be interpolated at certain other points (over the output grid).
+To increase the accuracy, you might also sample more than one point from within a pixel, giving you more points for a more accurate interpolation in the output grid.
-@menu
-* Statistics:: Calculate dataset statistics.
-* NoiseChisel:: Detect objects in an image.
-* Segment:: Segment detections based on signal structure.
-* MakeCatalog:: Catalog from input and labeled images.
-* Match:: Match two datasets.
-@end menu
+@cindex Image edges
+@cindex Edges, image
+However, interpolation has several problems.
+The first one is that it will depend on the type of function you want to
assume for the interpolation.
+For example, you can choose a bi-linear or bi-cubic (the `bi's are for the 2
dimensional nature of the data) interpolation method.
+For the latter there are various ways to set the constants@footnote{see
@url{http://entropymine.com/imageworsener/bicubic/} for a nice introduction.}.
+Such parametric interpolation functions can fail seriously on the edges of an image, or when there is a sharp change in value (for example, the bleeding of saturated bright stars in astronomical CCDs).
+They will also need normalization so that the flux of the objects before and
after the warping is comparable.
-@node Statistics, NoiseChisel, Data analysis, Data analysis
-@section Statistics
+The parametric nature of these methods adds a level of subjectivity to the
data (it makes more assumptions through the functions than the data can handle).
+For most applications this is fine (as discussed above: when the PSF is
over-sampled), but in scientific applications where we push our instruments to
the limit and the aim is the detection of the faintest possible galaxies or
fainter parts of bright galaxies, we cannot afford this loss.
+For these reasons, Warp does not use parametric interpolation techniques.
-The distribution of values in a dataset can provide valuable information about
it.
-For example, in an image, if it is a positively skewed distribution, we can
see that there is significant data in the image.
-If the distribution is roughly symmetric, we can tell that there is no
significant data in the image.
-In a table, when we need to select a sample of objects, it is important to
first get a general view of the whole sample.
+@cindex Drizzle
+@cindex Pixel mixing
+@cindex Exact area resampling
+Warp will do interpolation based on ``pixel mixing''@footnote{For a graphic
demonstration see @url{http://entropymine.com/imageworsener/pixelmixing/}.} or
``area resampling''.
+This is also similar to what the Hubble Space Telescope pipeline calls
``Drizzling''@footnote{@url{http://en.wikipedia.org/wiki/Drizzle_(image_processing)}}.
+This technique requires no functions; it is thus non-parametric.
+It is also the closest we can get (making the fewest assumptions) to what actually happens on the detector pixels.
-On the other hand, you might need to know certain statistical parameters of
the dataset.
-For example, if we have run a detection algorithm on an image, and we want to
see how accurate it was, one method is to calculate the average of the
undetected pixels and see how reasonable it is (if detection is done correctly,
the average of undetected pixels should be approximately equal to the
background value, see @ref{Sky value}).
-In a table, you might have calculated the magnitudes of a certain class of
objects and want to get some general characteristics of the distribution
immediately on the command-line (very fast!), to possibly change some
parameters.
-The Statistics program is designed for such situations.
+In pixel mixing, the basic idea is that you reverse-transform each output pixel to find which pixels of the input image it covers, and what fraction of each input pixel's area is covered by that output pixel.
+Each input pixel's value is then multiplied by the fraction of its area that overlaps with the output pixel (a number between 0 and 1).
+The output pixel's value is the sum of all these multiplications over the input pixels that it covers.
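+As a toy example (with arbitrary numbers), assume an output pixel that
+covers fractions 0.5, 0.3, 0.15 and 0.05 of four input pixels with
+values 10, 12, 11 and 13 respectively.
+Its value will then be:
+@dispmath{0.5\times10+0.3\times12+0.15\times11+0.05\times13=10.9}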
-@menu
-* Histogram and Cumulative Frequency Plot:: Basic definitions.
-* 2D Histograms:: Plotting the distribution of two variables.
-* Sigma clipping:: Definition of @mymath{\sigma}-clipping.
-* Least squares fitting:: Fitting with various parametric functions.
-* Sky value:: Definition and derivation of the Sky value.
-* Invoking aststatistics:: Arguments and options to Statistics.
-@end menu
+Through this process, pixels are treated as an area, not as a point (which is how detectors create the image); the brightness (see @ref{Brightness flux magnitude}) of an object will also be fully preserved.
+Since it involves the mixing of the input's pixel values, this pixel mixing
method is a form of @ref{Spatial domain convolution}.
+Therefore, after comparing the input and output, you will notice that the
output is slightly smoothed, thus boosting the more diffuse signal, but
creating correlated noise.
+In astronomical imaging the correlated noise will be decreased later when you
stack many exposures@footnote{If you are working on a single exposure image and
see pronounced Moir@'e patterns after Warping, check @ref{Moire pattern and its
correction} for a possible way to reduce them}.
+If there are very high spatial-frequency signals in the image (for example, fringes) which vary on a scale @emph{smaller than} your output image pixel size (this is rarely the case in astronomical imaging), pixel mixing can cause aliasing@footnote{@url{http://en.wikipedia.org/wiki/Aliasing}}.
+Therefore, in case such fringes are present, they have to be calculated and
removed separately (which would naturally be done in any astronomical reduction
pipeline).
+Because of the PSF, no astronomical target has a sharp change in its signal.
+Thus this issue is less important for astronomical applications, see @ref{PSF}.
-@node Histogram and Cumulative Frequency Plot, 2D Histograms, Statistics,
Statistics
-@subsection Histogram and Cumulative Frequency Plot
+To find the overlap area of the output pixel over the input pixels, we need to
define polygons and clip them (find the overlap).
+Usually, it is sufficient to define a pixel as a four-vertex polygon.
+However, when a non-linear distortion (for example, @code{SIP} or @code{TPV})
is present and the distortion is significant over an output pixel's size
(usually far from the reference point), the shadow of the output pixel on the
input grid can be curved.
+To account for such cases (which can only happen when correcting for
non-linear distortions), Warp has the @option{--edgesampling} option to sample
the output pixel over more vertices.
+For more, see the description of this option in @ref{Align pixels with WCS
considering distortions}.
-@cindex Histogram
-Histograms and the cumulative frequency plots are both used to visually study
the distribution of a dataset.
-A histogram shows the number of data points which lie within pre-defined
intervals (bins).
-So on the horizontal axis we have the bin centers and on the vertical, the
number of points that are in that bin.
-You can use it to get a general view of the distribution: which values have
been repeated the most? how close/far are the most significant bins? Are there
more values in the larger part of the range of the dataset, or in the lower
part? Similarly, many very important properties about the dataset can be
deduced from a visual inspection of the histogram.
-In the Statistics program, the histogram can be either output to a table to
plot with your favorite plotting program@footnote{
-We recommend @url{http://pgfplots.sourceforge.net/,PGFPlots} which generates
your plots directly within @TeX{} (the same tool that generates your
document).},
-or it can be shown with ASCII characters on the command-line, which is very
crude, but good enough for a fast and on-the-go analysis, see the example in
@ref{Invoking aststatistics}.
+@node Moire pattern and its correction, Invoking astwarp, Resampling, Warp
+@subsection Moir@'e pattern and its correction
-@cindex Intervals, histogram
-@cindex Bin width, histogram
-@cindex Normalizing histogram
-@cindex Probability density function
-The width of the bins is only necessary parameter for a histogram.
-In the limiting case that the bin-widths tend to zero (while assuming the
number of points in the dataset tend to infinity), then the histogram will tend
to the @url{https://en.wikipedia.org/wiki/Probability_density_function,
probability density function} of the distribution.
-When the absolute number of points in each bin is not relevant to the study
(only the shape of the histogram is important), you can @emph{normalize} a
histogram so like the probability density function, the sum of all its bins
will be one.
+@cindex Moir@'e pattern or fringes
+After warping some images with the default mode of Warp (see @ref{Align pixels
with WCS considering distortions}) you may notice that the background noise is
no longer flat.
+Some regions will be smoother and some will be sharper, depending on the orientation and distortion of the input/output pixel grids.
+This is due to the @url{https://en.wikipedia.org/wiki/Moir%C3%A9_pattern,
Moir@'e pattern}, which is especially noticeable/significant when two slightly
different grids are super-imposed.
-@cindex Cumulative Frequency Plot
-In the cumulative frequency plot of a distribution, the horizontal axis is the
sorted data values and the y axis is the index of each data in the sorted
distribution.
-Unlike a histogram, a cumulative frequency plot does not involve intervals or
bins.
-This makes it less prone to any sort of bias or error that a given bin-width
would have on the analysis.
-When a larger number of the data points have roughly the same value, then the
cumulative frequency plot will become steep in that vicinity.
-This occurs because on the horizontal axis, there is little change while on
the vertical axis, the indexes constantly increase.
-Normalizing a cumulative frequency plot means to divide each index (y axis) by
the total number of data points (or the last value).
+With the commands below, we'll download a single exposure image from the @url{https://www.j-plus.es,J-PLUS survey} and run Warp (on an @mymath{8\times8} arcmin@mymath{^2} region to speed up the demos here).
+Finally, we'll open the image to visually see the artificial Moir@'e pattern on the warped image.
-Unlike the histogram which has a limited number of bins, ideally the
cumulative frequency plot should have one point for every data element.
-Even in small datasets (for example, a @mymath{200\times200} image) this will
result in an unreasonably large number of points to plot (40000)! As a result,
for practical reasons, it is common to only store its value on a certain number
of points (intervals) in the input range rather than the whole dataset, so you
should determine the number of bins you want when asking for a cumulative
frequency plot.
-In Gnuastro (and thus the Statistics program), the number reported for each
bin is the total number of data points until the larger interval value for that
bin.
-You can see an example histogram and cumulative frequency plot of a single
dataset under the @option{--asciihist} and @option{--asciicfp} options of
@ref{Invoking aststatistics}.
+@example
+## Download the image (73.7 MB, containing a 9216x9232 pixel image)
+$ jplusdr2=http://archive.cefca.es/catalogues/vo/siap/jplus-dr2/reduced
+$ wget $jplusdr2/get_fits?id=771463 -Ojplus-exp1.fits.fz
-So as a summary, both the histogram and cumulative frequency plot in
Statistics will work with bins.
-Within each bin/interval, the lower value is considered to be within then bin
(it is inclusive), but its larger value is not (it is exclusive).
-Formally, an interval/bin between a and b is represented by [a, b).
-When the over-all range of the dataset is specified (with the
@option{--greaterequal}, @option{--lessthan}, or @option{--qrange} options),
the acceptable values of the dataset are also defined with a similar
inclusive-exclusive manner.
-But when the range is determined from the actual dataset (none of these
options is called), the last element in the dataset is included in the last
bin's count.
+## Align a small part of it with the sky coordinates.
+$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
+ --width=8/60 -ojplus-e1.fits
-@node 2D Histograms, Sigma clipping, Histogram and Cumulative Frequency Plot,
Statistics
-@subsection 2D Histograms
-@cindex 2D histogram
-@cindex Histogram, 2D
-In @ref{Histogram and Cumulative Frequency Plot} the concept of histograms
were introduced on a single dataset.
-But they are only useful for viewing the distribution of a single variable
(column in a table).
-In many contexts, the distribution of two variables in relation to each other
may be of interest.
-For example, the color-magnitude diagrams in astronomy, where the horizontal
axis is the luminosity or magnitude of an object, and the vertical axis is the
color.
-Scatter plots are useful to see these relations between the objects of
interest when the number of the objects is small.
-
-As the density of points in the scatter plot increases, the points will fall
over each other and just make a large connected region hide potentially
interesting behaviors/correlations in the densest regions.
-This is where 2D histograms can become very useful.
-A 2D histogram is composed of 2D bins (boxes or pixels), just as a 1D
histogram consists of 1D bins (lines).
-The number of points falling within each box/pixel will then be the value of
that box.
-Added with a color-bar, you can now clearly see the distribution independent
of the density of points (for example, you can even visualize it in log-scale
if you want).
-
-Gnuastro's Statistics program has the @option{--histogram2d} option for this
task.
-It takes a single argument (either @code{table} or @code{image}) that
specifies the format of the output 2D histogram.
-The two formats will be reviewed separately in the sub-sections below.
-But let's start with the generalities that are common to both (related to the
input, not the output).
+## Open the aligned region with DS9
+$ astscript-fits-view jplus-e1.fits
+@end example
-You can specify the two columns to be shown using the @option{--column} (or
@option{-c}) option.
-So if you want to plot the color-magnitude diagram from a table with the
@code{MAG-R} column on the horizontal and @code{COLOR-G-R} on the vertical
column, you can use @option{--column=MAG-r,COLOR-G-r}.
-The number of bins along each dimension can be set with @option{--numbins}
(for first input column) and @option{--numbins2} (for second input column).
+In the opened DS9 window, you can see the Moir@'e pattern as wave-like features in the noise: some parts of the noise are smoother and some parts are sharper.
+Right in the center of the image is a blob of sharp noise.
+Warp has the @option{--checkmaxfrac} option for direct inspection of the
Moir@'e pattern (described with the other options in @ref{Align pixels with WCS
considering distortions}).
+When run with this option, an extra HDU (called @code{MAX-FRAC}) will be added
to the output.
+The image in this HDU has the same size as the output.
+However, each output pixel will contain the largest (maximum) fraction of any single input pixel's area that it covered.
+So if an output pixel has a value of 0.9, this shows that it covered @mymath{90\%} of an input pixel.
+Let's run Warp with @option{--checkmaxfrac} and see the output (after DS9
opens, in the ``Cube'' window, flip between the first and second HDUs):
-Without specifying any range, the full range of values will be used in each
dimension.
-If you only want to focus on a certain interval of the values in the columns
in any dimension you can use the @option{--greaterequal} and
@option{--lessthan} options to limit the values along the first/horizontal
dimension and @option{--greaterequal2} and @option{--lessthan2} options for the
second/vertical dimension.
+@example
+$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
+ --width=8/60 -ojplus-e1.fits --checkmaxfrac
-@menu
-* 2D histogram as a table for plotting:: Format and usage in table format.
-* 2D histogram as an image:: Format and usage in image format
-@end menu
+$ astscript-fits-view jplus-e1.fits
+@end example
-@node 2D histogram as a table for plotting, 2D histogram as an image, 2D
Histograms, 2D Histograms
-@subsubsection 2D histogram as a table for plotting
+By comparing the first and second HDUs/extensions, you will clearly see that
the regions with a sharp noise pattern fall exactly on parts of the
@code{MAX-FRAC} extension with values larger than 0.5.
+In other words, these are output pixels where a single input pixel contributed more than half of the value.
+As this fraction increases, the sharpness also increases because a single
input pixel's value dominates the value of the output pixel.
+On the other hand, when this value is small, we see that many input pixels
contribute to that output pixel.
+Since many input pixels contribute to an output pixel, it acts like a
convolution, hence that output pixel becomes smoother (see @ref{Spatial domain
convolution}).
+Let's have a look at the distribution of the @code{MAX-FRAC} pixel values:
-When called with the @option{--histogram=table} option, Statistics will output
a table file with three columns that have the information of every box as a
column.
-If you asked for @option{--numbins=N} and @option{--numbins2=M}, all three
columns will have @mymath{M\times N} rows (one row for every box/pixel of the
2D histogram).
-The first and second columns are the position of the box along the first and
second dimensions.
-The third column has the number of input points that fall within that
box/pixel.
+@example
+$ aststatistics jplus-e1.fits -hMAX-FRAC
+Statistics (GNU Astronomy Utilities) @value{VERSION}
+-------
+Input: jplus-e1.fits (hdu: MAX-FRAC)
+-------
+ Number of elements: 744769
+ Minimum: 0.250213461
+ Maximum: 0.9987495374
+ Mode: 0.5034223567
+ Mode quantile: 0.3773819498
+ Median: 0.5520805544
+ Mean: 0.5693956458
+ Standard deviation: 0.1554693738
+-------
+Histogram:
+ | ***
+ | **********
+ | *****************
+ | ************************
+ | *******************************
+ | **************************************
+ | *********************************************
+ | ****************************************************
+ | ***********************************************************
+ | ******************************************************************
+ |**********************************************************************
+ |----------------------------------------------------------------------
+@end example
-For example, you can make high-quality plots within your paper (using the same
@LaTeX{} engine, thus blending very nicely with your text) using
@url{https://ctan.org/pkg/pgfplots, PGFPlots}.
-Below you can see one such minimal example, using your favorite text editor,
save it into a file, make the two small corrections in it, then run the
commands shown at the top.
-This assumes that you have @LaTeX{} installed, if not the steps to install a
minimally sufficient @LaTeX{} package on your system, see the respective
section in @ref{Bootstrapping dependencies}.
+The smallest value is 0.25 (=1/4), showing that 4 input pixels contributed to that output pixel's value.
+The maximum is almost 1.0, showing that a single input pixel defined that output pixel's value.
+You can also see that the most probable value (the mode) is 0.5, and that the distribution is positively skewed.
-The two parts that need to be corrected are marked with '@code{%% <--}': the
first one (@code{XXXXXXXXX}) should be replaced by the value to the
@option{--numbins} option which is the number of bins along the first dimension.
-The second one (@code{FILE.txt}) should be replaced with the name of the file
generated by Statistics.
+@cindex Pixel scale
+@cindex @code{CDELT}
+This is a well-known problem in astronomical imaging and professional
photography.
+If you only have a single image (that is already taken!), you can undersample
the input: set the angular size of the output pixels to be larger than the
input.
+This will decrease the resolution of your image, but will ensure that
pixel-mixing will always happen.
+In the example below we are setting the output pixel scale (which is known as
@code{CDELT} in the FITS standard) to @mymath{1/0.5=2} of the input's.
+In other words, each output pixel's edge will cover double the input pixel's edge on the sky, and the output's number of pixels in each dimension will be half of the previous output.
@example
-%% Replace 'XXXXXXXXX' with your selected number of bins in the first
-%% dimension.
-%%
-%% Then run these commands to build the plot in a LaTeX command.
-%% mkdir tikz
-%% pdflatex --shell-escape --halt-on-error report.tex
-\documentclass@{article@}
-
-%% Load PGFPlots and set it to build the figure separately in a 'tikz'
-%% directory (which has to exist before LaTeX is run). This
-%% "externalization" is very useful to include the commands of multiple
-%% plots in the middle of your paper/report, but also have the plots
-%% separately to use in slides or other scenarios.
-\usepackage@{pgfplots@}
-\usetikzlibrary@{external@}
-\tikzexternalize
-\tikzsetexternalprefix@{tikz/@}
-
-%% Define colormap for the PGFPlots 2D histogram
-\pgfplotsset@{
- /pgfplots/colormap=@{hsvwhitestart@}@{
- rgb255(0cm)=(255,255,255)
- rgb255(0.10cm)=(128,0,128)
- rgb255(0.5cm)=(0,0,230)
- rgb255(1.cm)=(0,255,255)
- rgb255(2.5cm)=(0,255,0)
- rgb255(3.5cm)=(255,255,0)
- rgb255(6cm)=(255,0,0)
- @}
-@}
+$ cdelt=$(astfits jplus-exp1.fits.fz --pixelscale -q \
+ | awk '@{print $1@}')
+$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
+ --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/0.5 \
+ --checkmaxfrac
+@end example
-%% Start the prinable document
-\begin@{document@}
+In the first extension, you can hardly see any Moir@'e pattern in the noise.
+When you go to the next (@code{MAX-FRAC}) extension, you will see that almost
all the pixels have a value of 1.
+Of course, decreasing the resolution by half is a little too drastic.
+Depending on your image, you may be able to reach a sufficiently good result without such a drastic degradation of the input image.
+For example, if you want an output pixel scale that is just 1.5 times larger
than the input, you can divide the original coordinate-delta (or ``cdelt'') by
@mymath{1/1.5=0.6666} and try again.
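+In case you want to try it, the command is identical to the previous
+one, except for the value given to @option{--cdelt}:
+
+@example
+$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
+          --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/0.6666 \
+          --checkmaxfrac
+@end example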
+In the @code{MAX-FRAC} extension, you will see that the range of pixel values is now between 0.56 and 1.0 (recall that originally, this was between 0.25 and 1.0).
+This shows that the pixels are more similarly mixed and in fact, when you look
at the actual warped image, you can hardly distinguish any Moir@'e pattern in
the noise.
- You can write a full paper here and include many figures!
- Describe what the two axes are, and how you measured them.
- Also, do not forget to explain what it shows and how to interpret it.
- You also have separate PDFs for every figure in the `tikz' directory.
- Feel free to change this text.
+@cindex Stacking
+@cindex Dithering
+@cindex Coaddition
+However, deep astronomical data are usually built from several exposures (images), not a single one.
+Each image is also taken by (slightly) shifting the telescope compared to the
previous exposure.
+This shift is known as ``dithering''.
+We do this for many reasons (for example tracking errors in the telescope,
high background values, removing the effect of bad pixels or those affected by
cosmic rays, robust flat pattern measurement, etc.@footnote{E.g.,
@url{https://www.stsci.edu/hst/instrumentation/wfc3/proposing/dithering-strategies}}).
+One of those ``etc.'' reasons is to correct the Moir@'e pattern in the final
coadded deep image.
- %% Draw the plot.
- \begin@{tikzpicture@}
- \small
- \begin@{axis@}[
- width=\linewidth,
- view=@{0@}@{90@},
- colorbar horizontal,
- xlabel=X axis,
- ylabel=Y axis,
- ylabel shift=-0.1cm,
- colorbar style=@{at=@{(0,1.01)@}, anchor=south west,
- xticklabel pos=upper@},
- ]
- \addplot3[
- surf,
- shader=flat corner,
- mesh/ordering=rowwise,
- mesh/cols=XXXXXXXXX, %% <-- Number of bins in 1st column.
- ] file @{FILE.txt@}; %% <-- Name of aststatistics output.
+Since the Moir@'e pattern is fixed to the pixel grid of each image, slightly shifting the telescope will cause the pattern to fall on different parts of the sky.
+Therefore, when we later stack (or coadd) the separate exposures into a deep image, the Moir@'e pattern will be decreased there.
+However, dithering has possible drawbacks depending on the scientific goal; for example, when observing time-variable phenomena where cutting the exposure into several shorter ones is not feasible.
+If this is not the case for you (for example, in galaxy evolution studies), continue with the rest of this section.
- \end@{axis@}
-\end@{tikzpicture@}
+Because we have multiple exposures that are slightly (sub-pixel) shifted, we
can also increase the spatial resolution of the output.
+For example, let's set the output coordinate-delta (or pixel scale) to be 1/2
of the input.
+In other words, the number of pixels in each dimension of the output is double that of the first Warp command of this section:
-%% End the printable document.
-\end@{document@}
-@end example
+@example
+$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
+ --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/2 \
+ --checkmaxfrac
-Let's assume you have put the @LaTeX{} source above, into a plain-text file
called @file{report.tex}.
-The PGFPlots call above is configured to build the plots as separate PDF files
in a @file{tikz/} directory@footnote{@url{https://www.ctan.org/pkg/pgf, TiKZ}
is the name of the lower-level engine behind PGPlots.}.
-This allows you to directly load those PDFs in your slides or other reports.
-Therefore, before building the PDF report, you should first make a
@file{tikz/} directory:
+$ aststatistics jplus-e1.fits -hMAX-FRAC --minimum --maximum
+0.06263604388 0.2506802701
-@example
-$ mkdir tikz
+$ astscript-fits-view jplus-e1.fits
@end example
-To build the final PDF, you should run @command{pdflatex} with the
@option{--shell-escape} option, so it can build the separate PDF(s) separately.
-We are also adding the @option{--halt-on-error} so it immediately aborts in
the case of an error (in the case of an error, by default @LaTeX{} will not
abort, it will stop and ask for your input to temporarily change things and try
fixing the error, but it has a special interface which can be hard to master).
+From the @command{aststatistics} command, you see that like the previous change in @option{--cdelt}, the range of @code{MAX-FRAC} has decreased.
+However, when you look at the warped image and the @code{MAX-FRAC} image with the last command, you still visually see the Moir@'e pattern in the noise (although it has significantly decreased compared to the original resolution).
+It is still present because the new resolution is an exact (integer) multiple of the original.
+Let's try increasing the resolution (oversampling) by a factor of 1.25 (which is not an exact multiple):
@example
-$ pdflatex --shell-escape --halt-on-error report.tex
+$ astwarp jplus-exp1.fits.fz --center=107.62920,39.72472 \
+ --width=8/60 -ojplus-e1.fits --cdelt=$cdelt/1.25 \
+ --checkmaxfrac
+$ astscript-fits-view jplus-e1.fits
@end example
-@noindent
-You can now open @file{report.pdf} to see your very high quality 2D histogram
within your text.
-And if you need the plots separately (for example, for slides), you can take
the PDF inside the @file{tikz/} directory.
-
-@node 2D histogram as an image, , 2D histogram as a table for plotting, 2D
Histograms
-@subsubsection 2D histogram as an image
-
-When called with the @option{--histogram=image} option, Statistics will output
a FITS file with an image/array extension.
-If you asked for @option{--numbins=N} and @option{--numbins2=M} the image will
have a size of @mymath{N\times M} pixels (one pixel per 2D bin).
-Also, the FITS image will have a linear WCS that is scaled to the 2D bin size
along each dimension.
-So when you hover your mouse over any part of the image with a FITS viewer
(for example, SAO DS9), besides the number of points in each pixel, you can
directly also see ``coordinates'' of the pixels along the two axes.
-You can also use the optimized and fast FITS viewer features for many aspects
of visually inspecting the distributions (which we will not go into further).
+You don't see any Moir@'e pattern in the noise any more, but when you look at
the @code{MAX-FRAC} extension, you see it is very different from the ones you
had seen before.
+In the previous @code{MAX-FRAC} image, you could see large blobs of similar
values.
+But here, you see that the variation is almost on a pixel scale, and the difference from one pixel to the next is not significant.
+This is why you don't see any Moir@'e pattern in the warped image.
-@cindex Color-magnitude diagram
-@cindex Diagram, Color-magnitude
-For example, let's assume you want to derive the color-magnitude diagram (CMD)
of the @url{http://uvudf.ipac.caltech.edu, UVUDF survey}.
-You can run the first command below to download the table with magnitudes of
objects in many filters and run the second command to see general column
metadata after it is downloaded.
+In J-PLUS, each part of the sky was observed with a three-point dithering
pattern.
+Let's download the other two exposures and warp the same region of the sky to
the same pixel grid (using the @option{--gridfile} feature).
+Then, let's open all three cropped images in one DS9 instance:
@example
-$ wget http://asd.gsfc.nasa.gov/UVUDF/uvudf_rafelski_2015.fits.gz
-$ asttable uvudf_rafelski_2015.fits.gz -i
-@end example
+$ wget $jplusdr2/get_fits?id=771465 -Ojplus-exp2.fits.fz
+$ wget $jplusdr2/get_fits?id=771467 -Ojplus-exp3.fits.fz
-Let's assume you want to find the color to be between the @code{F606W} and
@code{F775W} filters (roughly corresponding to the g and r filters in
ground-based imaging).
-However, the original table does not have color columns (there would be too
many combinations!).
-Therefore you can use the @ref{Column arithmetic} feature of Gnuastro's Table
program for deriving a new table with the @code{F775W} magnitude in one column
and the difference between the @code{F606W} and @code{F775W} on the other
column.
-With the second command, you can see the actual values if you like.
+$ astwarp jplus-exp2.fits.fz --gridfile jplus-e1.fits \
+ -o jplus-e2.fits --checkmaxfrac
+$ astwarp jplus-exp3.fits.fz --gridfile jplus-e1.fits \
+ -o jplus-e3.fits --checkmaxfrac
-@example
-$ asttable uvudf_rafelski_2015.fits.gz -cMAG_F775W \
- -c'arith MAG_F606W MAG_F775W -' \
- --colmetadata=ARITH_1,F606W-F775W,"AB mag" -ocmd.fits
-$ asttable cmd.fits
+$ astscript-fits-view jplus-e*.fits
@end example
@noindent
-You can now construct your 2D histogram as a @mymath{100\times100} pixel FITS
image with this command (assuming you want @code{F775W} magnitudes between 22
and 30, colors between -1 and 3 and 100 bins in each dimension).
-Note that without the @option{--manualbinrange} option the range of each axis
will be determined by the values within the columns (which may be larger or
smaller than your desired large).
+In the three warped images, you don't see any Moir@'e pattern; so far so good.
+Now, take the following steps:
+@enumerate
+@item
+Click on the ``Frame'' button (in the top row of buttons just on top of the
image), and select the ``Single'' button in the bottom row.
+@item
+Open the ``Zoom'' menu, and select ``Zoom 16''.
+@item
+In the bottom row of buttons right on top of the image, press the ``next''
button to flip through each exposure's @code{MAX-FRAC} extension.
+@item
+Focus your eyes on the pixels with the largest value (white colored pixels),
while pressing the ``next'' button to flip between the exposures.
+You will see that in each exposure they cover different pixels.
+@end enumerate
+
+The exercise above shows that the effect of the varying smoothing level (which had already shrunk to a per-pixel scale) will be further decreased after we stack the images.
+So let's stack these three images with the commands below.
+First, we need to remove the sky level from each image using @ref{NoiseChisel}; then we'll stack the @code{INPUT-NO-SKY} extensions using sigma-clipping, to reject outliers (see @ref{Sigma clipping} and the @ref{Stacking operators}).
@example
-aststatistics cmd.fits -cMAG_F775W,F606W-F775W --histogram2d=image \
- --numbins=100 --greaterequal=22 --lessthan=30 \
- --numbins2=100 --greaterequal2=-1 --lessthan2=3 \
- --manualbinrange --output=cmd-2d-hist.fits
-@end example
+$ astnoisechisel jplus-e1.fits -ojplus-nc1.fits
+$ astnoisechisel jplus-e2.fits -ojplus-nc2.fits
+$ astnoisechisel jplus-e3.fits -ojplus-nc3.fits
-@noindent
-If you have SAO DS9, you can now open this FITS file as a normal FITS image,
for example, with the command below.
-Try hovering/zooming over the pixels: not only will you see the number of
objects in the UVUDF catalog that fall in each bin, but you also see the
@code{F775W} magnitude and color of that pixel also.
+$ astarithmetic jplus-nc*.fits 3 5 0.2 sigclip-mean \
+ -gINPUT-NO-SKY -ojplus-stack.fits
-@example
-$ ds9 cmd-2d-hist.fits -cmap sls -zoom to fit
+$ astscript-fits-view jplus-nc*.fits jplus-stack.fits
@end example
@noindent
-With the first command below, you can activate the grid feature of DS9 to
actually see the coordinate grid, as well as values on each line.
-With the second command, DS9 will even read the labels of the axes and use
them to generate an almost publication-ready plot.
+After opening the individual exposures and the final stack with the last
command, take the following steps to see the comparisons properly:
+@enumerate
+@item
+Click on the stack image so it is selected.
+@item
+Go to the ``Frame'' menu, then the ``Lock'' item, then activate ``Scale and
Limits''.
+@item
+Scroll your mouse or touchpad to zoom into the image.
+@end enumerate
+
+@noindent
+You clearly see that the stacked image is deeper and that there is no Moir@'e
pattern, while you have slightly @emph{improved} the spatial resolution of the
output compared to the input.
+In case you want the stack to have the original pixel resolution, you just
need one more warp:
@example
-$ ds9 cmd-2d-hist.fits -cmap sls -zoom to fit -grid yes
-$ ds9 cmd-2d-hist.fits -cmap sls -zoom to fit -grid yes \
- -grid type publication
+$ astwarp jplus-stack.fits --cdelt=$cdelt -ojplus-stack-origres.fits
@end example
-If you are happy with the grid, coloring and the rest, you can also use ds9 to
save this as a JPEG image to directly use in your documents/slides with these
extra DS9 options (DS9 will write the image to @file{cmd-2d.jpeg} and quit
immediately afterwards):
+For optimal results, the oversampling should be determined by the dithering pattern of the observation.
+For example, if you only have two dither points, you want the pixels with the maximum value in the @code{MAX-FRAC} image of one exposure to fall on those with a minimum value in the other exposure.
+Ideally, many more dither points should be chosen when you are planning your observation (not just for the Moir@'e pattern, but also for all the other reasons mentioned above).
+Based on the dithering pattern, you want to select the increased resolution such that the maximum @code{MAX-FRAC} values of the different exposures fall on different pixels of the output grid.
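+For example, on the grids of this tutorial, you can quickly compare the
+range of mixing levels of all the exposures with a small loop like the
+sketch below:
+
+@example
+$ for i in 1 2 3; do \
+    aststatistics jplus-e$i.fits -hMAX-FRAC --minimum --maximum; \
+  done
+@end example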
-@example
-$ ds9 cmd-2d-hist.fits -cmap sls -zoom 4 -grid yes \
- -grid type publication -saveimage cmd-2d.jpeg -quit
-@end example
+@node Invoking astwarp, , Moire pattern and its correction, Warp
+@subsection Invoking Warp
-@cindex PGFPlots (@LaTeX{} package)
-This is good for a fast progress update.
-But for your paper or more official report, you want to show something with
higher quality.
-For that, you can use the PGFPlots package in @LaTeX{} to add axes in the same
font as your text, sharp grids and many other elegant/powerful features (like
over-plotting interesting points and lines).
-But to load the 2D histogram into PGFPlots first you need to convert the FITS
image into a more standard format, for example, PDF.
-We will use Gnuastro's @ref{ConvertType} for this, and use the
@code{sls-inverse} color map (which will map the pixels with a value of zero to
white):
+Warp will warp an input image into a new pixel grid by pixel mixing (see
@ref{Resampling}).
+Without any options, Warp will remove any non-linear distortions from the
image and align the output pixel coordinates to its WCS coordinates.
+Any homographic warp (for example, scaling, rotation, translation, projection,
see @ref{Linear warping basics}) can also be done by calling the relevant
option explicitly.
+The general template for invoking Warp is:
@example
-$ astconvertt cmd-2d-hist.fits --colormap=sls-inverse \
- --borderwidth=0 -ocmd-2d-hist.pdf
+$ astwarp [OPTIONS...] InputImage
@end example
@noindent
-Below you can see a minimally working example of how to add axis numbers,
labels and a grid to the PDF generated above.
-Copy and paste the @LaTeX{} code below into a plain-text file called
@file{cmd-report.tex}
-Notice the @code{xmin}, @code{xmax}, @code{ymin}, @code{ymax} values and how
they are the same as the range specified above.
+One line examples:
@example
-\documentclass@{article@}
-\usepackage@{pgfplots@}
-\dimendef\prevdepth=0
-\begin@{document@}
+## Align image with celestial coordinates and remove any distortion
+$ astwarp image.fits
-You can write all you want here...
+## Align four exposures to same pixel grid and stack them with
+## Arithmetic program's sigma-clipped mean operator (out of many
+## stacking operators, see Arithmetic's documentation).
+$ grid="--center=1.234,5.678 --widthinpix=1001,1001 --cdelt=0.2/3600"
+$ astwarp a.fits $grid --output=A.fits
+$ astwarp b.fits $grid --output=B.fits
+$ astwarp c.fits $grid --output=C.fits
+$ astwarp d.fits $grid --output=D.fits
+$ astarithmetic A.fits B.fits C.fits D.fits 4 5 0.2 sigclip-mean \
+ -g1 --output=stack.fits
-\begin@{tikzpicture@}
- \begin@{axis@}[
- enlargelimits=false,
- grid,
- axis on top,
- width=\linewidth,
- height=\linewidth,
- xlabel=@{Magnitude (F775W)@},
- ylabel=@{Color (F606W-F775W)@}]
+## Warp a previously created mock image to the same pixel grid as the
+## real image (including any distortions).
+$ astwarp mock.fits --gridfile=real.fits
- \addplot graphics[xmin=22, xmax=30, ymin=-1, ymax=3]
- @{cmd-2d-hist.pdf@};
- \end@{axis@}
-\end@{tikzpicture@}
-\end@{document@}
-@end example
+## Rotate and then scale input image:
+$ astwarp --rotate=37.92 --scale=0.8 image.fits
-@noindent
-Run this command to build your PDF (assuming you have @LaTeX{} and PGFPlots).
+## Scale, then translate the input image:
+$ astwarp --scale 8/3 --translate 2.1 image.fits
-@example
-$ pdflatex cmd-report.tex
+## Directly input a custom warping matrix (using fraction):
+$ astwarp --matrix=1/5,0,4/10,0,1/5,4/10,0,0,1 image.fits
+
+## Directly input a custom warping matrix, with final numbers:
+$ astwarp --matrix="0.7071,-0.7071, 0.7071,0.7071" image.fits
@end example
-The improved quality, blending in with the text, vector-graphics resolution
and other features make this plot pleasing to the eye, and let your readers
focus on the main point of your scientific argument.
-PGFPlots can also built the PDF of the plot separately from the rest of the
paper/report, see @ref{2D histogram as a table for plotting} for the necessary
changes in the preamble.
+If any processing is to be done, Warp needs to be given a 2D FITS image.
+As in all Gnuastro programs, when an output is not explicitly set with the
@option{--output} option, the output filename will be set automatically based
on the operation, see @ref{Automatic output}.
+For the full list of general options to all Gnuastro programs (including
Warp), please see @ref{Common options}.
-@node Sigma clipping, Least squares fitting, 2D Histograms, Statistics
-@subsection Sigma clipping
+Warp uses pixel mixing to derive the pixel values of the output image, see
@ref{Resampling}.
+To be the most accurate, the input image will be read as a 64-bit double
precision floating point dataset and all internal processing is done in this
format.
+Upon writing, by default it will be converted to 32-bit single precision
floating point type (actual observational data rarely have such precision!).
+In case you want a different output type, you can use the @option{--type}
option that is common to several Gnuastro programs.
+For example, if your input is a mock image without noise, and you want to preserve the 64-bit precision, use @option{--type=float64}.
+Just note that the file size will also double!
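+For example, a minimal sketch (assuming a hypothetical noiseless mock
+image called @file{mock.fits}):
+
+@example
+$ astwarp mock.fits --type=float64
+@end example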
+For more on the precision of various types, see @ref{Numeric data types}.
-Let's assume that you have pure noise (centered on zero) with a clear
@url{https://en.wikipedia.org/wiki/Normal_distribution,Gaussian distribution},
or see @ref{Photon counting noise}.
-Now let's assume you add very bright objects (signal) on the image which have
a very sharp boundary.
-By a sharp boundary, we mean that there is a clear cutoff (from the noise) at
the pixels the objects finish.
-In other words, at their boundaries, the objects do not fade away into the
noise.
-In such a case, when you plot the histogram (see @ref{Histogram and Cumulative
Frequency Plot}) of the distribution, the pixels relating to those objects will
be clearly separate from pixels that belong to parts of the image that did not
have any signal (were just noise).
-In the cumulative frequency plot, after a steady rise (due to the noise), you
would observe a long flat region were for a certain range of data (horizontal
axis), there is no increase in the index (vertical axis).
+By default (if no linear operation is requested), Warp will align the pixel
grid of the input image to the WCS coordinates it contains.
+This operation and the options that govern it are described in @ref{Align pixels with WCS considering distortions}.
+You can Warp an input image to the same pixel grid as a reference FITS file
using the @option{--wcsfile} option.
+In this case, the output image will take all the information needed from the
reference WCS file and HDU/extension specified with @option{--wcshdu}, thus it
will discard any other resampling options given.
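+For example, a minimal sketch (assuming the reference WCS is in HDU
+number 1 of a hypothetical @file{ref.fits}):
+
+@example
+$ astwarp input.fits --wcsfile=ref.fits --wcshdu=1
+@end example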
-@cindex Blurring
-@cindex Cosmic rays
-@cindex Aperture blurring
-@cindex Atmosphere blurring
-Outliers like the example above can significantly bias the measurement of
noise statistics.
-@mymath{\sigma}-clipping is defined as a way to avoid the effect of such
outliers.
-In astronomical applications, cosmic rays (when they collide at a near normal
incidence angle) are a very good example of such outliers.
-The tracks they leave behind in the image are perfectly immune to the blurring
caused by the atmosphere and the aperture.
-They are also very energetic and so their borders are usually clearly
separated from the surrounding noise.
-So @mymath{\sigma}-clipping is very useful in removing their effect on the
data.
-See Figure 15 in Akhlaghi and Ichikawa,
@url{https://arxiv.org/abs/1505.01664,2015}.
+If you need any custom linear warping (independent of the WCS, see @ref{Linear
warping basics}), you need to call the respective operation manually.
+These are described in @ref{Linear warps to be called explicitly}.
+Please note that you may not use both linear and non-linear modes
simultaneously.
+For example, you cannot scale or rotate the image while removing its
non-linear distortions at the same time.
-@mymath{\sigma}-clipping is defined as the very simple iteration below.
-In each iteration, the range of input data might decrease and so when the
outliers have the conditions above, the outliers will be removed through this
iteration.
-The exit criteria will be discussed below.
+The following options are shared between both modes:
-@enumerate
-@item
-Calculate the standard deviation (@mymath{\sigma}) and median (@mymath{m})
-of a distribution.
-@item
-Remove all points that are smaller or larger than
-@mymath{m\pm\alpha\sigma}.
-@item
-Go back to step 1, unless the selected exit criteria is reached.
-@end enumerate
+@table @option
+@item --hstartwcs=INT
+Specify the first header keyword number (line) that should be used to read the
WCS information, see the full explanation in @ref{Invoking astcrop}.
-@noindent
-The reason the median is used as a reference and not the mean is that the mean
is too significantly affected by the presence of outliers, while the median is
less affected, see @ref{Quantifying signal in a tile}.
-As you can tell from this algorithm, besides the condition above (that the
signal have clear high signal to noise boundaries) @mymath{\sigma}-clipping is
only useful when the signal does not cover more than half of the full data set.
-If they do, then the median will lie over the outliers and
@mymath{\sigma}-clipping might remove the pixels with no signal.
+@item --hendwcs=INT
+Specify the last header keyword number (line) that should be used to read the
WCS information, see the full explanation in @ref{Invoking astcrop}.
-There are commonly two exit criteria to stop the @mymath{\sigma}-clipping
-iteration:
+@item -C FLT
+@itemx --coveredfrac=FLT
+Depending on the warp, the output pixels that cover pixels on the edge of the
input image, or blank pixels in the input image, are not going to be fully
covered by input data.
+With this option, you can specify the acceptable covered fraction of such
pixels (any value between 0 and 1).
+If you only want output pixels that are fully covered by the input image area
(and are not blank), then you can set @option{--coveredfrac=1} (which is the
default!).
+Alternatively, a value of @code{0} will keep output pixels that are even
infinitesimally covered by the input.
+As a result, with @option{--coveredfrac=0}, the sum of the pixels in the input
and output images will be exactly the same.
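+For example, the sketch below (with a hypothetical input name) will
+keep any output pixel that is at least half-covered by input data:
+
+@example
+$ astwarp image.fits --coveredfrac=0.5
+@end example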
+@end table
-@itemize
-@item
-When a certain number of iterations has taken place (second value to the
@option{--sclipparams} option is larger than 1).
-@item
-When the new measured standard deviation is within a certain tolerance level
of the old one (second value to the @option{--sclipparams} option is less than
1).
-The tolerance level is defined by:
+@menu
+* Align pixels with WCS considering distortions:: Default operation.
+* Linear warps to be called explicitly:: Other warps.
+@end menu
-@dispmath{\sigma_{old}-\sigma_{new} \over \sigma_{new}}
+@node Align pixels with WCS considering distortions, Linear warps to be called
explicitly, Invoking astwarp, Invoking astwarp
+@subsubsection Align pixels with WCS considering distortions
-The standard deviation is used because it is heavily influenced by the
presence of outliers.
-Therefore the fact that it stops changing between two iterations is a sign
that we have successfully removed outliers.
-Note that in each clipping, the dispersion in the distribution is either less
or equal.
-So @mymath{\sigma_{old}\geq\sigma_{new}}.
-@end itemize
+@cindex Resampling
+@cindex WCS distortion
+@cindex TPV distortion
+@cindex SIP distortion
+@cindex Non-linear distortion
+@cindex Align pixel and WCS coordinates
+When none of the linear warps@footnote{For linear warps, see @ref{Linear warps to be called explicitly}.} are requested, Warp will align the input's pixel axes with its WCS axes.
+In the process, any possibly existing distortion is also removed (such as
@code{TPV} and @code{SIP}).
+Usually, the WCS axes are the Right Ascension and Declination in equatorial
coordinates.
+The output image's pixel grid is highly customizable through the options in
this section.
+To learn about Warp's strategy to build the new pixel grid, see
@ref{Resampling}.
+For strong distortions (that produce strong curvatures), you can fine-tune the
area-based resampling with @option{--edgesampling}, as described below.
+
+On the other hand, sometimes you need to warp an input image to the exact same grid of an already available reference FITS image with an existing WCS.
+If that image is already aligned, finding its center, number of pixels and pixel scale can be annoying (and just increases the complexity of your script).
+Alternatively, if that image is not aligned (for example, it has a certain rotation in the sky and a different distortion), there are too many WCS parameters to set (some are not yet available explicitly in the options here)!
+For such scenarios, Warp has the @option{--gridfile} option.
+When @option{--gridfile} is called, the options below that are used to define
the output's WCS will be ignored (these options: @option{--center},
@option{--widthinpix}, @option{--cdelt}, @option{--ctype}).
+In this case, the output's WCS and pixel grid will exactly match the image
given to @option{--gridfile} (including any rotation, pixel scale, or
distortion or projection).
@cartouche
@noindent
-When working on astronomical images, objects like galaxies and stars are
blurred by the atmosphere and the telescope aperture, therefore their signal
sinks into the noise very gradually.
-Galaxies in particular do not appear to have a clear high signal to noise
cutoff at all.
-Therefore @mymath{\sigma}-clipping will not be useful in removing their effect
on the data.
+@cindex Stacking
+@cindex Coaddition
+@strong{Set @option{--cdelt} explicitly when you plan to stack many warped images:}
+To align some images and later stack them, it is necessary to be sure that the pixel sizes of all the images are exactly the same.
+Most of the time, the pixel scale measured (during astrometry) for the separate exposures will differ in the second or third digit after the decimal point.
+This is a normal statistical error of the astrometric measurement.
+However, these slight differences can cause the output sizes to differ (by one or two pixels on a very large image).
-To gauge if @mymath{\sigma}-clipping will be useful for your dataset, look at
the histogram (see @ref{Histogram and Cumulative Frequency Plot}).
-The ASCII histogram that is printed on the command-line with
@option{--asciihist} is good enough in most cases.
+You can fix this by explicitly setting the pixel scale of each warped exposure
with Warp's @option{--cdelt} option that is described below.
+For good strategies of setting the pixel scale, see @ref{Moire pattern and its
correction}.
@end cartouche
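+As a sketch of this strategy (with hypothetical file names), you can
+measure the pixel scale once, and use it for all the exposures:
+
+@example
+$ cdelt=$(astfits exp1.fits --pixelscale -q | awk '@{print $1@}')
+$ astwarp exp1.fits --cdelt=$cdelt -oexp1-warped.fits
+$ astwarp exp2.fits --cdelt=$cdelt -oexp2-warped.fits
+@end example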
+Another problem that may arise when aligning images to new pixel grids is aliasing, or visible Moir@'e patterns on the output image.
+This artifact will mostly be removed if you are stacking several exposures, especially with a dithering pattern.
+If not, see @ref{Moire pattern and its correction} for ways to mitigate the visible patterns.
+See the description of @option{--gridfile} below for more.
-@node Least squares fitting, Sky value, Sigma clipping, Statistics
-@subsection Least squares fitting
-
-@cindex Radial profile
-@cindex Least squares fitting
-@cindex Fitting (least squares)
-@cindex Star formation main sequence
-After completing a good observation, doing robust data reduction and
finalizing the measurements, it is commonly necessary to parameterize the
derived correlations.
-For example, you have derived the radial profile of the PSF of your image (see
@ref{Building the extended PSF}).
-You now want to parameterize the radial profile to estimate the slope.
-Alternatively, you may have found the star formation rate and stellar mass of
your sample of galaxies.
-Now, you want to derive the star formation main sequence as a parametric
relation between the two.
-The fitting functions below can be used for such purposes.
-
-@cindex GSL
-@cindex GNU Scientific Library
-Gnuastro's least squares fitting features are just wrappers over the least
squares fitting methods of the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html, linear} and
@url{https://www.gnu.org/software/gsl/doc/html/nls.html, nonlinear}
least-squares fitting functions of the GNU Scientific Library (GSL).
-For the low-level details and equations of the methods, please see the GSL
documentation.
-The names have been preserved here in Gnuastro to simplify the connection with
GSL and follow the details in the detailed documentation there.
+@cartouche
+@noindent
+@cindex WCSLIB
+@strong{Known issue:} Warp's WCS-based aligning works best with WCSLIB version 7.12 (released in September 2022) and above.
+With older versions of WCSLIB, you might get a @code{wcss2p} error.
+@end cartouche
-GSL is a very low-level library, designed for maximal portability to many
scenarios, and power.
-Therefore calling GSL's functions directly for a fast operation requires a
good knowledge of the C programming language and many lines of code.
-As a low-level library, GSL is designed to be the back-end of higher-level
programs (like Gnuastro).
-Through the Statistics program, in Gnuastro we provide a high-level interface
to access to GSL's very powerful least squares fitting engine to read/write
from/to standard data formats in astronomy.
-A fully working example is shown below.
+@table @option
+@item -c FLT,FLT
+@itemx --center=FLT,FLT
+@cindex CRVALi
+@cindex Aligning an image
+WCS coordinates of the center of the central pixel of the output image.
+Since a central pixel is only defined with an odd number of pixels along both
dimensions, the output will always have an odd number of pixels.
+When neither @option{--center} nor @option{--gridfile} is given, the output will have the same central WCS coordinate as the input.
-@cindex Gaussian noise
-@cindex Noise (Gaussian)
-To activate fitting in Statistics, simply give your desired fitting method to
the @option{--fit} option (for the full list of acceptable methods, see
@ref{Fitting options}).
-For example, with the command below, we'll build a fake measurement table
(including noise) from the polynomial @mymath{y=1.23-4.56x+7.89x^2}.
-To understand how this equation translates to the command below (part before
@code{set-y}), see @ref{Reverse polish notation} and @ref{Column arithmetic}.
-We will set the X axis to have values from 0.1 to 2, with steps of 0.01, and add random Gaussian noise to each @mymath{y} measurement: @mymath{\sigma_y=0.1y}.
-To make the random number generation exactly reproducible, we are also setting
the seed (see @ref{Generating random numbers}, which also uses GSL as a
backend).
-To learn more about the @code{mknoise-sigma} operator, see the Arithmetic
program's @ref{Random number generators}.
+Usually, the WCS coordinates are Right Ascension and Declination (when the
first three characters of @code{CTYPE1} and @code{CTYPE2} are respectively
@code{RA-} and @code{DEC}).
+For more on the @code{CTYPEi} keyword values, see @code{--ctype} below.
-@example
-$ export GSL_RNG_SEED=1664015492
-$ seq 0.1 0.01 2 \
- | asttable --output=noisy.fits --envseed -c1 \
- -c'arith 1.23 -4.56 $1 x + 7.89 $1 x $1 x + set-y \
- 0.1 y x set-yerr \
- y yerr mknoise-sigma yerr' \
- --colmetadata=1,X --colmetadata=2,Y \
- --colmetadata=3,Yerr
-@end example
+@item -w INT[,INT]
+@itemx --width=INT[,INT]
+Width and height of the output image in units of WCS (usually degrees).
+If you want the values to be read as pixels, also call the
@option{--widthinpix} option with @option{--width}.
+If a single value is given, Warp will use the same value for the second
dimension (creating a square output).
+When neither @option{--width} nor @option{--gridfile} is given, Warp will calculate the necessary size of the output pixel grid to fully contain the input image.
-@noindent
-Let's have a look at the output plot with TOPCAT using the command below.
+Usually the WCS coordinates are in units of degrees (defined by the
@code{CUNITi} keywords of the FITS standard).
+But entering a certain number of arcseconds or arcminutes for the width can be
annoying (you will usually need to go to the calculator!).
+To simplify such situations, this option also accepts division.
+For example, @option{--width=1/60,2/60} will make an aligned warp that is 1 arcminute wide along Right Ascension and 2 arcminutes along Declination.
-@example
-$ astscript-fits-view noisy.fits
-@end example
+With the @option{--widthinpix} option the values will be interpreted as
numbers of pixels.
+In this scenario, this option should be given @emph{odd} integer(s) that are
greater than 1.
+This ensures that the output image can have a @emph{central} pixel.
+Recall that through the @option{--center} option, you specify the WCS
coordinate of the center of the central pixel.
+The central coordinate of an image with an even number of pixels will be on
the edge of two pixels, so a ``central'' pixel is not well defined.
+If any of the given values are even, Warp will automatically add a single
pixel (to make it an odd integer) and print a warning message.
-@noindent
-To see the error-bars, after opening the scatter plot, go into the ``Form''
tab for that plot.
-Click on the button with a green ``+'' sign followed by ``Forms'' and select
``XYError''.
-On the side-menu, in front of ``Y Positive Error'', select the @code{Yerr}
column of the input table.
+@item --widthinpix
+When called, the values given to the @option{--width} option will be interpreted as the number of pixels along each dimension.
+See the description of @option{--width} for more.
-As you see, the error bars do indeed increase for higher X axis values.
-Since we have error bars in this example (as in any measurement), we can use
weighted fitting.
-Also, this isn't a linear relation, so we'll use a polynomial to second order
(a maximum power of 2 in the form of @mymath{Y=c_0+c_1X+c_2X^2}):
+@item -x FLT[,FLT]
+@itemx --cdelt=FLT[,FLT]
+@cindex CDELTi
+@cindex Pixel scale
+Coordinate deltas or increments (@code{CDELTi} in the FITS standard), or the
pixel scale in both dimensions.
+If a single value is given, it will be used for both axes.
+In this way, the output's pixels will be squares on the sky at the reference
point (as is usually expected!).
+When neither @option{--cdelt} nor @option{--gridfile} is given, Warp will read the input's pixel scale and choose the larger of @code{CDELT1} or @code{CDELT2} so the output pixels are square.
-@example
-$ aststatistics noisy.fits -cX,Y,Yerr --fit=polynomial-weighted \
- --fitmaxpower=2
-Statistics (GNU Astronomy Utilities) @value{VERSION}
--------
-Fitting results (remove extra info with '--quiet' or '-q)
- Input file: noisy.fits (hdu: 1) with 191 non-blank rows.
- X column: X
- Y column: Y
- Weight column: Yerr [Standard deviation of Y in each row]
+Usually (when dealing with RA and Dec, and the @code{CUNITi}s have a value of
@code{deg}), the units of the given values are degrees/pixel.
+Warp allows you to easily convert from @emph{arcsec} to @emph{degrees} by
simply appending a @code{/3600} to the value.
+For example, for an output image of pixel scale @code{0.27} arcsec/pixel, you
can use @code{--cdelt=0.27/3600}.
-Fit function: Y = c0 + (c1 * X^1) + (c2 * X^2) + ... (cN * X^N)
- N: 2
- c0: +1.2286211608
- c1: -4.5127796636
- c2: +7.8435883943
-Covariance matrix:
- +0.0010496001 -0.0039928488 +0.0028367390
- -0.0039928488 +0.0175244127 -0.0138030778
- +0.0028367390 -0.0138030778 +0.0128129806
+@item --ctype=STR,STR
+@cindex Align
+@cindex CTYPEi
+@cindex Resampling
+The coordinate types of the output (@code{CTYPE1} and @code{CTYPE2} keywords
in the FITS standard), separated by a comma.
+By default the value of this option is `@code{RA---TAN,DEC--TAN}'.
+However, if @option{--gridfile} is given, this option is ignored.
-Reduced chi^2 of fit:
- +0.9740670090
-@end example
+If you don't call @option{--ctype} or @option{--gridfile}, the output WCS
coordinates will be Right Ascension and Declination, while the output's
projection will be
@url{https://en.wikipedia.org/wiki/Gnomonic_projection,Gnomonic}, also known as
Tangential (TAN).
+This combination is the most common in extra-galactic imaging surveys.
+For other coordinates and projections in your output use other values, as
described below.
-As you see from the elaborate message, the weighted polynomial fitting has returned the @mymath{c_0}, @mymath{c_1} and @mymath{c_2} of @mymath{Y=c_0+c_1X+c_2X^2} that best represent the data we inserted.
-Our input values were @mymath{c_0=1.23}, @mymath{c_1=-4.56} and @mymath{c_2=7.89}, and the fitted values are @mymath{c_0\approx1.2286}, @mymath{c_1\approx-4.5128} and @mymath{c_2\approx7.8436} (statistically, a very good fit, given that we knew the original values a-priori!).
-The covariance matrix is also calculated; it is necessary for estimating error bars on the estimations and contains a lot of information (e.g., possible correlations between parameters).
-Finally, the reduced @mymath{\chi^2} (or @mymath{\chi_{red}^2}) of the fit is
also printed (which was the measure to minimize).
-A @mymath{\chi_{red}^2\approx1} shows a good fit.
-This is good for real-world scenarios when you don't know the original values
a-priori.
-For more on interpreting @mymath{\chi_{red}^2\approx1}, see
@url{https://arxiv.org/abs/1012.3754, Andrae et al (2010)}.
+According to the FITS standard version 4.0@footnote{FITS standard version 4.0:
@url{https://fits.gsfc.nasa.gov/standard40/fits_standard40aa-le.pdf}}:
@code{CTYPEi} is the
+``type for the Intermediate-coordinate Axis @mymath{i}.
+Any coordinate type that is not covered by this Standard or an officially
recognized FITS convention shall be taken to be linear.
+All non-linear coordinate system names must be expressed in `4--3' form: the first four characters specify the coordinate type, the fifth character is a hyphen (@code{-}), and the remaining three characters specify an algorithm code for computing the world coordinate value.
+Coordinate types with names of fewer than four characters are padded on the
right with hyphens, and algorithm codes with fewer than three characters are
padded on the right with SPACE.
+Algorithm codes should be three characters''
+(see list of algorithm codes below).
-The comparison of fitted and input values looks pretty good, but nothing beats visual inspection!
-To see how this looks compared to the data, let's open the table again:
+@cindex WCS Projections
+@cindex Projections (world coordinate system)
+You can use any of the projection algorithms (last three characters of each
coordinate's type) provided by your host WCSLIB (a mandatory dependency of
Gnuastro; see @ref{WCSLIB}).
+For a very elaborate and complete description of projection algorithms in the
FITS WCS standard, see @url{https://doi.org/10.1051/0004-6361:20021327,
Calabretta and Greisen 2002}.
+Wikipedia also has a nice article on
@url{https://en.wikipedia.org/wiki/Map_projection, Map projections}.
+As an example, WCSLIB 7.12 (released in September 2022) has the following
projection algorithms:
-@example
-$ astscript-fits-view noisy.fits
-@end example
+@table @code
+@item AZP
+@cindex Zenithal/azimuthal projection
+Zenithal/azimuthal perspective
+@item SZP
+@cindex Slant zenithal projection
+Slant zenithal perspective
+@item TAN
+@cindex Gnomonic (tangential) projection
+Gnomonic (tangential)
+@item STG
+@cindex Stereographic projection
+Stereographic
+@item SIN
+@cindex Orthographic/synthesis projection
+Orthographic/synthesis
+@item ARC
+@cindex Zenithal/azimuthal equidistant projection
+Zenithal/azimuthal equidistant
+@item ZPN
+@cindex Zenithal/azimuthal polynomial projection
+Zenithal/azimuthal polynomial
+@item ZEA
+@cindex Zenithal/azimuthal equal area projection
+Zenithal/azimuthal equal area
+@item AIR
+@cindex Airy projection
+Airy
+@item CYP
+@cindex Cylindrical perspective projection
+Cylindrical perspective
+@item CEA
+@cindex Cylindrical equal area projection
+Cylindrical equal area
+@item CAR
+@cindex Plate carree projection
+Plate carree
+@item MER
+@cindex Mercator projection
+Mercator
+@item SFL
+@cindex Sanson-Flamsteed projection
+Sanson-Flamsteed
+@item PAR
+@cindex Parabolic projection
+Parabolic
+@item MOL
+@cindex Mollweide projection
+Mollweide
+@item AIT
+@cindex Hammer-Aitoff projection
+Hammer-Aitoff
+@item COP
+@cindex Conic perspective projection
+Conic perspective
+@item COE
+@cindex Conic equal area projection
+Conic equal area
+@item COD
+@cindex Conic equidistant projection
+Conic equidistant
+@item COO
+@cindex Conic orthomorphic projection
+Conic orthomorphic
+@item BON
+@cindex Bonne projection
+Bonne
+@item PCO
+@cindex Polyconic projection
+Polyconic
+@item TSC
+@cindex Tangential spherical cube projection
+Tangential spherical cube
+@item CSC
+@cindex COBE spherical cube projection
+COBE spherical cube
+@item QSC
+@cindex Quadrilateralized spherical cube projection
+Quadrilateralized spherical cube
+@item HPX
+@cindex HEALPix projection
+HEALPix
+@item XPH
+@cindex Butterfly projection
+@cindex HEALPix polar projection
+HEALPix polar, aka "butterfly"
+@end table
-Repeat the steps above to show the scatter plot and error-bars.
-Then, go to the ``Layers'' menu and select ``Add Function Control''.
-Use the results above to fill the box in front of ``Function Expression'':
@code{1.2286+(-4.5128*x)+(7.8436*x*x)}.
-You will see that the second order polynomial falls very nicely over the
points@footnote{After plotting, you will notice that the legend made the plot
too thin.
-Fortunately you have a lot of empty space within the plot.
-To bring the legend in, click on the ``Legend'' item on the bottom-left menu,
in the ``Location'' tab, click on ``Internal'' and hold and move it to the
top-left in the box below.
-To make the functional fit more clear, you can click on the ``Function'' item
of the bottom-left menu.
-In the ``Style'' tab, change the color and thickness.}.
-But this fit is not perfect: it also has errors (inherited from the
measurement errors).
-We need the covariance matrix to estimate the errors on each point, and that
can be complex to do by hand.
+@item -G
+@itemx --gridfile
+FITS filename containing the final pixel grid and WCS for the output image.
+The HDU/extension containing the grid should be specified with @option{--gridhdu} or its short option @option{-H}.
+The HDU must contain a WCS; otherwise, Warp will abort with an error.
+When this option is used, Warp will read the respective WCS and the size of the image to resample the input.
+Since the WCS of this HDU contains everything needed to construct the output pixel grid, the following options will be ignored when @option{--gridfile} is called: @option{--cdelt}, @option{--center}, and @option{--widthinpix}.
-Fortunately GSL has the tools to easily estimate the function at any point and
also calculate its corresponding error.
-To access this feature within Gnuastro's Statistics program, you should use
the @option{--fitestimate} option.
-You can either give an independent table file name (with
@option{--fitestimatehdu} and @option{--fitestimatecol} to specify the HDU and
column in that file), or just @code{self} so it uses the same X axis column
that was used in this fit.
-Let's use the easier case:
+In the example below, let's use this option to put the image of M51 in one
survey (J-PLUS) into the pixel grid of another survey (SDSS) containing M51.
+The J-PLUS field of view is very large (almost @mymath{1.5\times1.5}
deg@mymath{^2}, in @mymath{9500\times9500} pixels), while the field of view of
SDSS in each filter is small (almost @mymath{0.3\times0.25} deg@mymath{^2} in
@mymath{2048\times1489} pixels).
+With the first two commands, we'll first download the two images, then we'll
extract the portion of the J-PLUS image that overlaps with the SDSS image and
align it exactly to SDSS's pixel grid.
+Note that these are the two images that were used in two of Gnuastro's
tutorials: @ref{Building the extended PSF} and @ref{Detecting large extended
targets}.
@example
-$ aststatistics noisy.fits -cX,Y,Yerr --fit=polynomial-weighted \
- --fitmaxpower=2 --fitestimate=self --output=est.fits
-
-...[[truncated; same as above]]...
+## Download the J-PLUS DR2 image of M51 in the r filter.
+$ jplusbase="http://archive.cefca.es/catalogues/vo/siap"
+$ wget $jplusbase/jplus-dr2/get_fits?id=67510 \
+ -O jplus.fits.fz
-Requested estimation:
- Written to: est.fits
-@end example
+## Download the SDSS image in r filter and decompress it
+## (Bzip2 is not a standard FITS compression algorithm).
+$ sdssbase=https://dr12.sdss.org/sas/dr12/boss/photoObj/frames
+$ wget $sdssbase/301/3716/6/frame-r-003716-6-0117.fits.bz2 \
+ -O sdss.fits.bz2
+$ bunzip2 sdss.fits.bz2
-The first lines of the printed text are the same as before.
-Afterwards, you will see a new line printed in the output, saying that the
estimation was written in @file{est.fits}.
-You can now inspect the two tables with TOPCAT again with the command below.
-After TOPCAT opens, plot both scatter plots:
+## Warp and crop the J-PLUS image so the output exactly
+## matches the SDSS pixel grid.
+$ astwarp jplus.fits.fz --gridfile=sdss.fits --gridhdu=0 \
+ --output=jplus-on-sdss.fits
-@example
-$ astscript-fits-view noisy.fits est.fits
+## View the two images side-by-side:
+$ astscript-fits-view sdss.fits jplus-on-sdss.fits
@end example
-It is clear that they fall nicely on top of each other.
-The @file{est.fits} table also has a third column with error bars.
-You can follow the same steps before and draw the error bars to see how they
compare with the scatter of the measured data.
-They are much smaller than the error in each point because we had a very good
sampling of the function in our noisy data.
-
-Another useful point with the estimated output file is that it contains all
the fitting outputs as keywords in the header:
+As the example above shows, this option can therefore be very useful when comparing images from multiple surveys.
+But there are other very interesting use cases also.
+For example, you may be making a mock dataset and need to add distortion to the image so it matches the distortion of your camera.
+Through @option{--gridfile} and @option{--gridhdu}, you can easily impose that distortion on the mock image and put it in the pixel grid of an exposure.
-@example
-$ astfits est.fits -h1
-...[[truncated]]...
+@item -H
+@itemx --gridhdu
+The HDU/extension of the reference grid file given to @option{--gridfile} or its short option @option{-G} (see the description of @option{--gridfile} for more).
- / Fit results
-FITTYPE = 'polynomial-weighted' / Functional form of the fitting.
-FITMAXP = 2 / Maximum power of polynomial.
-FITIN = 'noisy.fits' / Name of file with input columns.
-FITINHDU= '1 ' / Name or Number of HDU with input cols.
-FITXCOL = 'X ' / Name or Number of independent (X) col.
-FITYCOL = 'Y ' / Name or Number of measured (Y) column.
-FITWCOL = 'Yerr ' / Name or Number of weight column.
-FITWNAT = 'Standard deviation' / Nature of weight column.
-FRDCHISQ= 0.974067008958516 / Reduced chi^2 of fit.
-FITC0 = 1.22862116084727 / C0: multiple of x^0 in polynomial
-FITC1 = -4.51277966356177 / C1: multiple of x^1 in polynomial
-FITC2 = 7.84358839431161 / C2: multiple of x^2 in polynomial
-FCOV11 = 0.00104960011629718 / Covariance matrix element (1,1).
-FCOV12 = -0.00399284880859776 / Covariance matrix element (1,2).
-FCOV13 = 0.00283673901863388 / Covariance matrix element (1,3).
-FCOV21 = -0.00399284880859776 / Covariance matrix element (2,1).
-FCOV22 = 0.0175244126670659 / Covariance matrix element (2,2).
-FCOV23 = -0.0138030778380786 / Covariance matrix element (2,3).
-FCOV31 = 0.00283673901863388 / Covariance matrix element (3,1).
-FCOV32 = -0.0138030778380786 / Covariance matrix element (3,2).
-FCOV33 = 0.0128129806394559 / Covariance matrix element (3,3).
+@item --edgesampling=INT
+Number of extra samplings along the edge of a pixel.
+By default the value is @code{0}: the output pixel's polygon over the input will be a quadrilateral (a polygon with four edges/vertices).
-...[[truncated]]...
-@end example
+Warp uses pixel mixing to derive the output pixel values.
+For a complete introduction, see @ref{Resampling}, and in particular its later
part on distortions.
+To account for this possible curvature due to distortion, you can use this
option.
+For example, @option{--edgesampling=1} will add one extra vertex in the middle of each edge of the output pixel, producing an 8-vertex polygon.
+Similarly, @option{--edgesampling=5} will put 5 extra vertices along each edge, thus sampling the shape (and possible curvature) of the output pixel over an input pixel with a polygon of @mymath{4+5\times4=24} vertices.
+Since the polygon clipping will happen for every output pixel, a higher value for this option can significantly reduce the running speed and increase the RAM usage of Warp; so use it with caution: in most cases the default @option{--edgesampling=0} is sufficient.
-In scenarios where you don't want the estimation, but only the fitted parameters, all that verbose, human-friendly text or FITS keywords can be an annoying extra step.
-For such cases, you should use the @option{--quiet} option like below.
-It will print the parameters, rows of the covariance matrix and
@mymath{\chi_{red}^2} on separate lines with nothing extra.
-This allows you to parse the values in any way that you would like.
+To visually inspect the curvature effect on pixel area of the input image, see
option @option{--pixelareaonwcs} in @ref{Pixel information images}.
-@example
-$ aststatistics noisy.fits -cX,Y,Yerr --fit=polynomial-weighted \
- --fitmaxpower=2 --quiet
-+1.2286211608 -4.5127796636 +7.8435883943
-+0.0010496001 -0.0039928488 +0.0028367390
--0.0039928488 +0.0175244127 -0.0138030778
-+0.0028367390 -0.0138030778 +0.0128129806
-+0.9740670090
-@end example
+@item --checkmaxfrac
+Check each output pixel's maximum coverage on the input data and append it as the `@code{MAX-FRAC}' HDU/extension to the output aligned image.
+This option provides an easy visual inspection for possible recurring patterns or fringes caused by aligning to a new pixel grid.
+For more detail about the origin of these patterns and how to mitigate them, see @ref{Moire pattern and its correction}.
-As a final example, because real data usually have outliers, let's look at the
``robust'' polynomial fit which has special features to remove outliers.
-First, we need to add some outliers to the table.
-To do this, we'll make a plain-text table with @command{echo}, and use Table's
@option{--catrowfile} to concatenate (or append) those two rows to the original
table.
-Finally, we'll run the same fitting step above:
+Note that the `@code{MAX-FRAC}' HDU/extension does not show the patterns themselves; it represents the largest area coverage on the input data for each output pixel.
+The values range from 0 to 1, where 1 means the pixel covers at least one complete pixel of the input data.
+On the other hand, 0 means that the pixel does not cover any pixels of the input at all.
+@end table
-@example
-$ echo "0.6 20 0.01" > outliers.txt
-$ echo "0.8 20 0.01" >> outliers.txt
-$ asttable noisy.fits --catrowfile=outliers.txt \
- --output=with-outlier.fits
-$ aststatistics with-outlier.fits -cX,Y,Yerr --fit=polynomial-weighted \
- --fitmaxpower=2 --fitestimate=self \
- --output=est-out.fits
-Statistics (GNU Astronomy Utilities) @value{VERSION}
--------
-Fitting results (remove extra info with '--quiet' or '-q)
- Input file: with-outlier.fits (hdu: 1) with 193 non-blank rows.
- X column: X
- Y column: Y
- Weight column: Yerr [Standard deviation of Y in each row]
+@node Linear warps to be called explicitly, , Align pixels with WCS
considering distortions, Invoking astwarp
+@subsubsection Linear warps to be called explicitly
-Fit function: Y = c0 + (c1 * X^1) + (c2 * X^2) + ... (cN * X^N)
- N: 2
- c0: -13.6446036899
- c1: +66.8463258547
- c2: -30.8746303591
+Linear warps include operations like rotation, scaling, shearing, etc.
+For an introduction, see @ref{Linear warping basics}.
+These are warps that don't depend on the WCS of the image and should be
explicitly requested.
+To align the input pixel coordinates with the WCS coordinates, see @ref{Align
pixels with WCS considering distortions}.
-Covariance matrix:
- +0.0007889160 -0.0027706310 +0.0022208939
- -0.0027706310 +0.0113922468 -0.0100306732
- +0.0022208939 -0.0100306732 +0.0094087226
+While they will correct any existing WCS based on the warp, they can also
operate on images without any WCS.
+For example, you may have a mock image that doesn't (yet!) have a WCS, created on an over-sampled grid and convolved with an over-sampled PSF.
+In this scenario, you can use the @option{--scale} option to under-sample it
to your desired resolution.
+This is similar to the @ref{Sufi simulates a detection} tutorial.
-Reduced chi^2 of fit:
- +4501.8356719150
+Linear warps must be specified as command-line options, either as (possibly
multiple) modular warpings (for example, @option{--rotate}, or
@option{--scale}), or directly as a single raw matrix (with @option{--matrix}).
+If specified together, the latter (direct matrix) will take precedence and all
the modular warpings will be ignored.
+Any number of modular warpings can be specified on the command-line and
configuration files.
+If more than one modular warping is given, all will be merged to create one
warping matrix.
+As described in @ref{Merging multiple warpings}, matrix multiplication is not
commutative, so the order of specifying the modular warpings on the
command-line, and/or configuration files makes a difference (see
@ref{Configuration file precedence}).
+The full list of modular warpings and the other options particular to Warp are
described below.
-Requested estimation:
- Written to: est-out.fits
-@end example
+The values to the warping options (modular warpings as well as @option{--matrix}) are a sequence of at least one number.
+Each number in this sequence is separated from the next by a comma (@key{,}).
+Each number can also be written as a single fraction (with a forward-slash
@key{/} between the numerator and denominator).
+Space and Tab characters are permitted between any two numbers, just do not
forget to quote the whole value.
+Otherwise, the value will not be fully passed onto the option.
+See the examples above as a demonstration.
-We see that the coefficient values have changed significantly and that
@mymath{\chi_{red}^2} has increased to @mymath{4501}!
-Recall that a good fit should have @mymath{\chi_{red}^2\approx1}.
-These numbers clearly show that the fit was bad, but again, nothing beats a
visual inspection.
-To visually see the effect of those outliers, let's plot them with the command
below.
-You see that those two points have clearly caused a turn in the fitted result, which is terrible.
+@cindex FITS standard
+Based on the FITS standard, integer values are assigned to the center of a
pixel and the coordinate [1.0, 1.0] is the center of the first pixel (bottom
left of the image when viewed in SAO DS9).
+So the coordinate center [0.0, 0.0] is half a pixel away (in each axis) from
the bottom left vertex of the first pixel.
+The resampling that is done in Warp (see @ref{Resampling}) is done on the
coordinate axes and thus directly depends on the coordinate center.
+In some situations this is fine; for example, when rotating/aligning a real image, all the edge pixels will be similarly affected.
+But in other situations (for example, when scaling an over-sampled mock image to its intended resolution), this is not desired: you want the center of the coordinates to be on the corner of the pixel.
+In such cases, you can use the @option{--centeroncorner} option which will
shift the center by @mymath{0.5} before the main warp, then shift it back by
@mymath{-0.5} after the main warp.
-@example
-$ astscript-fits-view with-outlier.fits est-out.fits
-@end example
+@table @option
-For such cases, GSL has
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#robust-linear-regression,
Robust linear regression}.
-In Gnuastro's Statistics, you can access it with
@option{--fit=polynomial-robust}, like the example below.
-Just note that the robust method doesn't take an error column (because it
estimates the errors internally while rejecting outliers, based on the method).
+@item -r FLT
+@itemx --rotate=FLT
+Rotate the input image by the given angle in degrees: @mymath{\theta} in
@ref{Linear warping basics}.
+Note that, commonly, the WCS structure of the image is set such that the RA is the inverse of the image horizontal axis (which increases towards the right in the FITS standard and as viewed by SAO DS9).
+So the default center for rotation is on the right of the image.
+If you want to rotate about other points, you have to translate the warping center first (with @option{--translate}), then apply your rotation, and then return the center back to the original position (with another call to @option{--translate}; see @ref{Merging multiple warpings}).
-@example
-$ aststatistics with-outlier.fits -cX,Y --fit=polynomial-robust \
- --fitmaxpower=2 --fitestimate=self \
- --output=est-out.fits --quiet
+@item -s FLT[,FLT]
+@itemx --scale=FLT[,FLT]
+Scale the input image by the given factor(s): @mymath{M} and @mymath{N} in
@ref{Linear warping basics}.
+If only one value is given, then both image axes will be scaled with the given
value.
+When two values are given (separated by a comma), the first will be used to
scale the first axis and the second will be used for the second axis.
+If you only need to scale one axis, use @option{1} for the axis you do not
need to scale.
+The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
-$ astfits est-out.fits -h1 | grep ^FITC
-FITC0 = 1.20422691185238 / C0: multiple of x^0 in polynomial
-FITC1 = -4.4779253576348 / C1: multiple of x^1 in polynomial
-FITC2 = 7.84986153686548 / C2: multiple of x^2 in polynomial
+@item -f FLT[,FLT]
+@itemx --flip=FLT[,FLT]
+Flip the input image around the given axis (or axes).
+If only one value is given, then both image axes are flipped.
+When two values are given (separated by a comma), you can choose which axis to flip over.
+@option{--flip} only takes values @code{0} (for no flip), or @code{1} (for a
flip).
+Hence, if you want to flip by the second axis only, use @option{--flip=0,1}.
-$ astscript-fits-view with-outlier.fits est-out.fits
-@end example
+@item -e FLT[,FLT]
+@itemx --shear=FLT[,FLT]
+Shear the input image by the given value(s): @mymath{A} and @mymath{B} in
@ref{Linear warping basics}.
+If only one value is given, then both image axes will be sheared with the
given value.
+When two values are given (separated by a comma), the first will be used to
shear the first axis and the second will be used for the second axis.
+If you only need to shear along one axis, use @option{0} for the axis that
must be untouched.
+The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
-It is clear that the coefficients are very similar to the no-outlier scenario above, and if you run the second command to view the scatter plots in TOPCAT, you will also see that the fit nicely follows the curve and is not affected by those two points.
-GSL provides many methods to reject outliers.
-For their full list, see the description of @option{--fitrobust} in
@ref{Fitting options}.
-For a description of the outlier rejection methods, see the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#c.gsl_multifit_robust_workspace,
GSL manual}.
+@item -t FLT[,FLT]
+@itemx --translate=FLT[,FLT]
+Translate (move the center of coordinates) the input image by the given
value(s): @mymath{c} and @mymath{f} in @ref{Linear warping basics}.
+If only one value is given, then both image axes will be translated by the
given value.
+When two values are given (separated by a comma), the first will be used to
translate the first axis and the second will be used for the second axis.
+If you only need to translate along one axis, use @option{0} for the axis that
must be untouched.
+The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
-You may have noticed that, unlike the cases before, the last Statistics command above didn't print anything on the standard output.
-This is because @option{--quiet} and @option{--fitestimate} were called together.
-In this case, all the fitting parameters are written as FITS keywords, and because of the @option{--quiet} option, they are no longer printed on standard output.
+@item -p FLT[,FLT]
+@itemx --project=FLT[,FLT]
+Apply a projection to the input image by the given value(s): @mymath{g} and @mymath{h} in @ref{Linear warping basics}.
+If only one value is given, then projection will apply to both axes with the
given value.
+When two values are given (separated by a comma), the first will be used to
project the first axis and the second will be used for the second axis.
+If you only need to project along one axis, use @option{0} for the axis that
must be untouched.
+The value(s) can also be written (on the command-line or in configuration
files) as a fraction.
-@node Sky value, Invoking aststatistics, Least squares fitting, Statistics
-@subsection Sky value
+@item -m STR
+@itemx --matrix=STR
+The warp/transformation matrix.
+All the elements in this matrix must be separated by comma (@key{,}) characters and, as described above, you can also use fractions (a forward-slash between two numbers).
+The transformation matrix can be either a 2 by 2 (4 numbers) or a 3 by 3 (9 numbers) array.
+In the former case (if a 2 by 2 matrix is given), it is put into a 3 by 3 matrix (see @ref{Linear warping basics}).
-@cindex Sky
-One of the most important aspects of a dataset is its reference value: the
value of the dataset where there is no signal.
-Without knowing (and thus removing the effect of) this value, it is impossible to compare the derived results of many high-level analyses over the dataset with other datasets (in the attempt to associate our results with the ``real'' world).
+@cindex NaN
+The determinant of the matrix has to be non-zero and it must not contain any
non-number values (for example, infinities or NaNs).
+The elements of the matrix have to be written row by row.
+So for the general Homography matrix of @ref{Linear warping basics}, it should
be called with @command{--matrix=a,b,c,d,e,f,g,h,1}.
-In astronomy, this reference value is known as the ``Sky'' value: the value that noise fluctuates around, where there is no signal from detectable objects or artifacts (for example, galaxies, stars, planets or comets, star spikes or internal optical ghosts).
-Depending on the dataset, the Sky value may be a fixed value over the whole dataset, or it may vary based on location.
-For an example of the latter case, see Figure 11 in @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
+The raw matrix takes precedence over all the modular warping options listed
above, so if it is called with any number of modular warps, the latter are
ignored.
-Because of the significance of the Sky value in astronomical data analysis, we
have devoted this subsection to it for a thorough review.
-We start with a thorough discussion on its definition (@ref{Sky value
definition}).
-In the astronomical literature, researchers use a variety of methods to estimate the Sky value, so in @ref{Sky value misconceptions} we review those and discuss their biases.
-From the definition of the Sky value, the most accurate way to estimate the Sky value is to run a detection algorithm (for example, @ref{NoiseChisel}) over the dataset and use the undetected pixels.
-However, a more crude (but simpler) method may be useful when good direct detection is not initially possible (for example, due to too many cosmic rays in a shallow image); it is discussed in @ref{Quantifying signal in a tile}.
+@item --centeroncorner
+Put the center of coordinates on the corner of the first (bottom-left when
viewed in SAO DS9) pixel.
+This option is applied after the final warping matrix has been finalized:
either through modular warpings or the raw matrix.
+See the explanation above for coordinates in the FITS standard to better
understand this option and when it should be used.
-@menu
-* Sky value definition:: Definition of the Sky/reference value.
-* Sky value misconceptions:: Wrong methods to estimate the Sky value.
-* Quantifying signal in a tile:: Method to estimate the presence of signal.
-@end menu
+@item -k
+@itemx --keepwcs
+@cindex WCSLIB
+@cindex World Coordinate System
+Do not correct the WCS information of the input image and save it untouched to
the output image.
+By default the WCS (World Coordinate System) information of the input image is
going to be corrected in the output image so the objects in the image are at
the same WCS coordinates.
+But in some cases it might be useful to keep it unchanged (for example, to
correct alignments).
+@end table
-@node Sky value definition, Sky value misconceptions, Sky value, Sky value
-@subsubsection Sky value definition
-@cindex Sky value
-This analysis is taken from @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa (2015)}.
-Let's assume that all instrument defects -- bias, dark and flat -- have been
corrected and the magnitude (see @ref{Brightness flux magnitude}) of a detected
object, @mymath{O}, is desired.
-The sources of flux on pixel@footnote{For this analysis the dimension of the
data (image) is irrelevant.
-So if the data is an image (2D) with width of @mymath{w} pixels, then a pixel
located on column @mymath{x} and row @mymath{y} (where all counting starts from
zero and (0, 0) is located on the bottom left corner of the image), would have
an index: @mymath{i=x+y\times{}w}.} @mymath{i} of the image can be written as
follows:
-@itemize
-@item
-Contribution from the target object (@mymath{O_i}).
-@item
-Contribution from other detected objects (@mymath{D_i}).
-@item
-Undetected objects or the fainter undetected regions of bright objects
(@mymath{U_i}).
-@item
-@cindex Cosmic rays
-A cosmic ray (@mymath{C_i}).
-@item
-@cindex Background flux
-The background flux, which is defined to be the count if none of the others
exists on that pixel (@mymath{B_i}).
-@end itemize
-@noindent
-The total flux in this pixel (@mymath{T_i}) can thus be written as:
-@dispmath{T_i=B_i+D_i+U_i+C_i+O_i.}
-@cindex Cosmic ray removal
-@noindent
-By definition, @mymath{D_i} is detected and it can be assumed that it is correctly estimated (deblended) and subtracted; we can thus set @mymath{D_i=0}.
-There are also methods to detect and remove cosmic rays, for example, the
method described in van Dokkum (2001)@footnote{van Dokkum, P. G. (2001).
-Publications of the Astronomical Society of the Pacific. 113, 1420.}, or by
comparing multiple exposures.
-This allows us to set @mymath{C_i=0}.
-Note that in practice, @mymath{D_i} and @mymath{U_i} are correlated, because
they both directly depend on the detection algorithm and its input parameters.
-Also note that no detection or cosmic ray removal algorithm is perfect.
-With these limitations in mind, the observed Sky value for this pixel
(@mymath{S_i}) can be defined as
-@cindex Sky value
-@dispmath{S_i\equiv{}B_i+U_i.}
-@noindent
-Therefore, as the detection process (algorithm and input parameters) becomes
more accurate, or @mymath{U_i\to0}, the Sky value will tend to the background
value or @mymath{S_i\to B_i}.
-Hence, we see that while @mymath{B_i} is an inherent property of the data
(pixel in an image), @mymath{S_i} depends on the detection process.
-Over a group of @mymath{N} pixels, for example, in an image or part of an image, this equation translates to the average of undetected pixels (Sky@mymath{={1\over N}\sum{S_i}}).
-With this definition of Sky, the object flux in the data can be calculated,
per pixel, with
-@dispmath{ T_{i}=S_{i}+O_{i} \quad\rightarrow\quad
- O_{i}=T_{i}-S_{i}.}
-@cindex photo-electrons
-In the fainter outskirts of an object, a very small fraction of the photo-electrons in a pixel actually belongs to objects; the rest is caused by random factors (noise), see Figure 1b in @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
-Therefore even a small over-estimation of the Sky value will result in the loss of a very large portion of most galaxies.
-Besides the lost area/brightness, this will also cause an over-estimation of the Sky value and thus even more under-estimation of the object's magnitude.
-It is thus very important to detect the diffuse flux of a target, even if it is not your primary target.
-In summary, the more accurately the Sky is measured, the more accurately the
magnitude (calculated from the sum of pixel values) of the target object can be
measured (photometry).
-Any under/over-estimation in the Sky will directly translate to an
over/under-estimation of the measured object's magnitude.
-@cartouche
-@noindent
-The @strong{Sky value} is only correctly found when all the detected
-objects (@mymath{D_i} and @mymath{C_i}) have been removed from the data.
-@end cartouche
-@node Sky value misconceptions, Quantifying signal in a tile, Sky value
definition, Sky value
-@subsubsection Sky value misconceptions
-As defined in @ref{Sky value}, the sky value is only accurately defined when the detection algorithm is not significantly reliant on the sky value, in particular for defining its detection threshold.
-However, most signal-based detection tools@footnote{According to Akhlaghi and
Ichikawa (2015), signal-based detection is a detection process that relies
heavily on assumptions about the to-be-detected objects.
-This method was the most heavily used technique prior to the introduction of
NoiseChisel in that paper.} use the sky value as a reference to define the
detection threshold.
-These older techniques therefore had to rely on approximations based on other
assumptions about the data.
-A review of those other techniques can be seen in Appendix A of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
-These methods were extensively used in astronomical data analysis for several
decades, therefore they have given rise to a lot of misconceptions, ambiguities
and disagreements about the sky value and how to measure it.
-As a summary, the major methods used until now were an approximation of the
mode of the image pixel distribution and @mymath{\sigma}-clipping.
-@itemize
-@cindex Histogram
-@cindex Distribution mode
-@cindex Mode of a distribution
-@cindex Probability density function
-@item
-To find the mode of a distribution those methods would either have to assume
(or find) a certain probability density function (PDF) or use the histogram.
-But astronomical datasets can have any distribution, making it almost
impossible to define a generic function.
-Also, histogram-based results are very inaccurate (there is a large dispersion) and they depend on the histogram bin-widths.
-Generally, the mode of a distribution also shifts as signal is added.
-Therefore, even if it is accurately measured, the mode is a biased measure for
the Sky value.
-@cindex Sigma-clipping
-@item
-Another approach was to iteratively clip the brightest pixels in the image
(which is known as @mymath{\sigma}-clipping).
-See @ref{Sigma clipping} for a complete explanation.
-@mymath{\sigma}-clipping is useful when there are clear outliers (an object
with a sharp edge in an image for example).
-However, real astronomical objects have diffuse and faint wings that penetrate
deeply into the noise, see Figure 1 in @url{https://arxiv.org/abs/1505.01664,
Akhlaghi and Ichikawa (2015)}.
-@end itemize
-As discussed in @ref{Sky value}, the sky value can only be correctly defined
as the average of undetected pixels.
-Therefore all such approaches that try to approximate the sky value prior to
detection are ultimately poor approximations.
+@node Data analysis, Data modeling, Data manipulation, Top
+@chapter Data analysis
+Astronomical datasets (images or tables) contain very valuable information; the tools in this section can help in analyzing, extracting, and quantifying that information.
+For example, getting general or specific statistics of the dataset (with
@ref{Statistics}), detecting signal within a noisy dataset (with
@ref{NoiseChisel}), or creating a catalog from an input dataset (with
@ref{MakeCatalog}).
+@menu
+* Statistics:: Calculate dataset statistics.
+* NoiseChisel:: Detect objects in an image.
+* Segment:: Segment detections based on signal structure.
+* MakeCatalog:: Catalog from input and labeled images.
+* Match:: Match two datasets.
+@end menu
-@node Quantifying signal in a tile, , Sky value misconceptions, Sky value
-@subsubsection Quantifying signal in a tile
+@node Statistics, NoiseChisel, Data analysis, Data analysis
+@section Statistics
-In order to define detection thresholds on the image, or calibrate it for
measurements (subtract the signal of the background sky and define errors), we
need some basic measurements.
-For example, the quantile threshold in NoiseChisel (@option{--qthresh}
option), or the mean of the undetected regions (Sky) and the Sky standard
deviation (Sky STD) which are the output of NoiseChisel and Statistics.
-But astronomical images will contain a lot of stars and galaxies that will
bias those measurements if not properly accounted for.
-Quantifying where signal is present is thus a very important step in the usage
of a dataset; for example, if the Sky level is over-estimated, your target
object's magnitude will be under-estimated.
+The distribution of values in a dataset can provide valuable information about
it.
+For example, in an image, if the pixel distribution is positively skewed, we can tell that there is significant signal in the image.
+If the distribution is roughly symmetric, we can tell that there is no significant signal in the image.
+In a table, when we need to select a sample of objects, it is important to
first get a general view of the whole sample.
-@cindex Data
-@cindex Noise
-@cindex Signal
-@cindex Gaussian distribution
-Let's start by clarifying some definitions:
-@emph{Signal} is defined as the non-random source of flux in each pixel (you
can think of this as the mean in a Gaussian or Poisson distribution).
-In astronomical images, signal is mostly photons coming of a star or galaxy,
and counted in each pixel.
-@emph{Noise} is defined as the random source of flux in each pixel (or the
standard deviation of a Gaussian or Poisson distribution).
-Noise is mainly due to counting errors in the detector electronics upon data
collection.
-@emph{Data} is defined as the combination of signal and noise (so a noisy
image of a galaxy is one @emph{data}set).
+On the other hand, you might need to know certain statistical parameters of
the dataset.
+For example, if we have run a detection algorithm on an image, and we want to
see how accurate it was, one method is to calculate the average of the
undetected pixels and see how reasonable it is (if detection is done correctly,
the average of undetected pixels should be approximately equal to the
background value, see @ref{Sky value}).
+In a table, you might have calculated the magnitudes of a certain class of
objects and want to get some general characteristics of the distribution
immediately on the command-line (very fast!), to possibly change some
parameters.
+The Statistics program is designed for such situations.
-When a dataset does not have any signal (for example, you take an image with a
closed shutter, producing an image that only contains noise), the mean, median
and mode of the distribution are equal within statistical errors.
-Signal from emitting objects, like astronomical targets, always has a positive
value and will never become negative, see Figure 1 in
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
-Therefore, when signal is added to the data (you take an image with an open
shutter pointing to a galaxy for example), the mean, median and mode of the
dataset shift to the positive, creating a positively skewed distribution.
-The shift of the mean is the largest.
-The median shifts less, since it is defined after ordering all the
elements/pixels (the median is the value at a quantile of 0.5), thus it is not
affected by outliers.
-Finally, the mode's shift to the positive is the least.
+@menu
+* Histogram and Cumulative Frequency Plot:: Basic definitions.
+* 2D Histograms:: Plotting the distribution of two variables.
+* Sigma clipping:: Definition of @mymath{\sigma}-clipping.
+* Least squares fitting:: Fitting with various parametric functions.
+* Sky value:: Definition and derivation of the Sky value.
+* Invoking aststatistics:: Arguments and options to Statistics.
+@end menu
-@cindex Mean
-@cindex Median
-@cindex Quantile
-Inverting the argument above gives us a robust method to quantify the
significance of signal in a dataset: when the mean and median of a distribution
are approximately equal we can argue that there is no significant signal.
-In other words: when the quantile of the mean (@mymath{q_{mean}}) is around
0.5.
-This definition of skewness through the quantile of the mean is further introduced with a real image in the tutorials, see @ref{Skewness caused by signal and its measurement}.
-@cindex Signal-to-noise ratio
-However, in an astronomical image, some of the pixels will contain more signal
than the rest, so we cannot simply check @mymath{q_{mean}} on the whole dataset.
-For example, if we only look at the patch of pixels that are placed under the
central parts of the brightest stars in the field of view, @mymath{q_{mean}}
will be very high.
-The signal in other parts of the image will be weaker, and in some parts it
will be much smaller than the noise (for example, 1/100-th of the noise level).
-When the signal-to-noise ratio is very small, we can generally assume no signal (because it is effectively impossible to measure it) and @mymath{q_{mean}} will be approximately 0.5.
+@node Histogram and Cumulative Frequency Plot, 2D Histograms, Statistics,
Statistics
+@subsection Histogram and Cumulative Frequency Plot
-To address this problem, we break the image into a grid of tiles@footnote{The
options to customize the tessellation are discussed in @ref{Processing
options}.} (see @ref{Tessellation}).
-For example, a tile can be a square box of size @mymath{30\times30} pixels.
-By measuring @mymath{q_{mean}} on each tile, we can find which tiles contain significant signal and ignore them.
-Technically, if a tile's @mymath{|q_{mean}-0.5|} is larger than the value
given to the @option{--meanmedqdiff} option, that tile will be ignored for the
next steps.
-You can read this option as ``mean-median-quantile-difference''.
+@cindex Histogram
+Histograms and the cumulative frequency plots are both used to visually study
the distribution of a dataset.
+A histogram shows the number of data points which lie within pre-defined
intervals (bins).
+So on the horizontal axis we have the bin centers and on the vertical, the
number of points that are in that bin.
+You can use it to get a general view of the distribution: Which values have been repeated the most? How close/far are the most significant bins? Are there more values in the larger part of the range of the dataset, or in the lower part? Many other important properties of the dataset can be deduced from a visual inspection of the histogram.
+In the Statistics program, the histogram can be either output to a table to
plot with your favorite plotting program@footnote{
+We recommend @url{http://pgfplots.sourceforge.net/,PGFPlots} which generates
your plots directly within @TeX{} (the same tool that generates your
document).},
+or it can be shown with ASCII characters on the command-line, which is very
crude, but good enough for a fast and on-the-go analysis, see the example in
@ref{Invoking aststatistics}.
-@cindex Skewness
-@cindex Convolution
-The raw dataset's pixel distribution (in each tile) is noisy; to decrease the noise/error in estimating @mymath{q_{mean}}, we convolve the image before tessellation (see @ref{Convolution process}).
-Convolution decreases the range of the dataset and enhances its skewness, see Section 3.1.1 and Figure 4 in @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
-This enhanced skewness can be interpreted as an increase in the Signal to
noise ratio of the objects buried in the noise.
-Therefore, to obtain an even better measure of the presence of signal in a
tile, the mean and median discussed above are measured on the convolved image.
+@cindex Intervals, histogram
+@cindex Bin width, histogram
+@cindex Normalizing histogram
+@cindex Probability density function
+The width of the bins is the only necessary parameter for a histogram.
+In the limiting case that the bin-widths tend to zero (while assuming the
number of points in the dataset tend to infinity), then the histogram will tend
to the @url{https://en.wikipedia.org/wiki/Probability_density_function,
probability density function} of the distribution.
+When the absolute number of points in each bin is not relevant to the study (only the shape of the histogram is important), you can @emph{normalize} a histogram so that, like the probability density function, the sum of all its bins is one.
-@cindex Cosmic rays
-There is one final hurdle: raw astronomical datasets are commonly peppered
with Cosmic rays.
-Images of Cosmic rays are not smoothed by the atmosphere or telescope
aperture, so they have sharp boundaries.
-Also, since they do not occupy too many pixels, they do not affect the mode
and median calculation.
-But their very high values can greatly bias the calculation of the mean
(recall how the mean shifts the fastest in the presence of outliers), for
example, see Figure 15 in @url{https://arxiv.org/abs/1505.01664, Akhlaghi and
Ichikawa (2015)}.
-The effect of outliers like cosmic rays on the mean and standard deviation can
be removed through @mymath{\sigma}-clipping, see @ref{Sigma clipping} for a
complete explanation.
+@cindex Cumulative Frequency Plot
+In the cumulative frequency plot of a distribution, the horizontal axis is the sorted data values and the vertical axis is the index of each value in the sorted distribution.
+Unlike a histogram, a cumulative frequency plot does not involve intervals or
bins.
+This makes it less prone to any sort of bias or error that a given bin-width
would have on the analysis.
+When a larger number of the data points have roughly the same value, then the
cumulative frequency plot will become steep in that vicinity.
+This occurs because on the horizontal axis, there is little change while on
the vertical axis, the indexes constantly increase.
+Normalizing a cumulative frequency plot means to divide each index (y axis) by
the total number of data points (or the last value).
-Therefore, after asserting that the mean and median are approximately equal in
a tile (see @ref{Tessellation}), the Sky and its STD are measured on each tile
after @mymath{\sigma}-clipping with the @option{--sigmaclip} option (see
@ref{Sigma clipping}).
-In the end, some of the tiles will pass the test and will be given a value.
-Others (that had signal in them) will just be assigned a NaN (not-a-number)
value.
-But we need a measurement over each tile (and thus pixel).
-We will therefore use interpolation to assign a value to the NaN tiles.
+Unlike the histogram which has a limited number of bins, ideally the
cumulative frequency plot should have one point for every data element.
+Even in small datasets (for example, a @mymath{200\times200} image) this will
result in an unreasonably large number of points to plot (40000)! As a result,
for practical reasons, it is common to only store its value on a certain number
of points (intervals) in the input range rather than the whole dataset, so you
should determine the number of bins you want when asking for a cumulative
frequency plot.
+In Gnuastro (and thus the Statistics program), the number reported for each bin is the total number of data points up to the larger interval value of that bin.
+You can see an example histogram and cumulative frequency plot of a single
dataset under the @option{--asciihist} and @option{--asciicfp} options of
@ref{Invoking aststatistics}.
-However, prior to interpolating over the failed tiles, another point should be
considered: large and extended galaxies, or bright stars, have wings which sink
into the noise very gradually.
-In some cases, the gradient over these wings can be on scales larger than the tiles (for example, the pixel value changes by @mymath{0.1\sigma} after 100 pixels, but the tile has a width of 30 pixels).
+So as a summary, both the histogram and cumulative frequency plot in
Statistics will work with bins.
+Within each bin/interval, the lower value is considered to be within the bin (it is inclusive), but its larger value is not (it is exclusive).
+Formally, an interval/bin between a and b is represented by [a, b).
+When the over-all range of the dataset is specified (with the
@option{--greaterequal}, @option{--lessthan}, or @option{--qrange} options),
the acceptable values of the dataset are also defined with a similar
inclusive-exclusive manner.
+But when the range is determined from the actual dataset (none of these
options is called), the last element in the dataset is included in the last
bin's count.
-In such cases, the @mymath{q_{mean}} test will be successful, even though
there is signal.
-Recall that @mymath{q_{mean}} is a measure of skewness.
-If we do not identify (and thus set to NaN) such outlier tiles before the
interpolation, the photons of the outskirts of the objects will leak into the
detection thresholds or Sky and Sky STD measurements and bias our result, see
@ref{Detecting large extended targets}.
-Therefore, the final step of ``quantifying signal in a tile'' is to look at
this distribution of successful tiles and remove the outliers.
-@mymath{\sigma}-clipping is a good solution for removing a few outliers, but
the problem with outliers of this kind is that there may be many such tiles
(depending on the large/bright stars/galaxies in the image).
-We therefore apply the following local outlier rejection strategy.
+@node 2D Histograms, Sigma clipping, Histogram and Cumulative Frequency Plot,
Statistics
+@subsection 2D Histograms
+@cindex 2D histogram
+@cindex Histogram, 2D
+In @ref{Histogram and Cumulative Frequency Plot}, the concept of histograms was introduced for a single dataset.
+But they are only useful for viewing the distribution of a single variable
(column in a table).
+In many contexts, the distribution of two variables in relation to each other
may be of interest.
+One example is the color-magnitude diagram in astronomy, where the horizontal axis is the luminosity or magnitude of an object, and the vertical axis is the color.
+Scatter plots are useful to see these relations between the objects of interest when the number of objects is small.
-For each tile, we find the nearest @mymath{N_{ngb}} tiles that had a usable
value (@mymath{N_{ngb}} is the value given to @option{--outliernumngb}).
-We then sort them and find the difference between the largest and
second-to-smallest elements (The minimum is not used because the scatter can be
large).
-Let's call this the tile's @emph{slope} (measured from its neighbors).
-All the tiles that are on a region of flat noise will have similar slope
values, but if a few tiles fall on the wings of a bright star or large galaxy,
their slope will be significantly larger than the tiles with no signal.
-We just have to find the smallest tile slope value that is an outlier compared
to the rest, and reject all tiles with a slope larger than that.
+As the density of points in the scatter plot increases, the points will fall over each other, forming a large connected region that hides potentially interesting behaviors/correlations in the densest regions.
+This is where 2D histograms can become very useful.
+A 2D histogram is composed of 2D bins (boxes or pixels), just as a 1D
histogram consists of 1D bins (lines).
+The number of points falling within each box/pixel will then be the value of
that box.
+Combined with a color-bar, you can now clearly see the distribution independent of the density of points (for example, you can even visualize it in log-scale if you want).
-@cindex Outliers
-@cindex Identifying outliers
-To identify the smallest outlier, we will use the distribution of distances
between sorted elements.
-Let's assume the total number of tiles with a good mean-median quantile
difference is @mymath{N}.
-They are first sorted and searching for the outlier starts on element
@mymath{N/3} (integer division).
-Let's take @mymath{v_i} to be the @mymath{i}-th element of the sorted input
(with no blank values) and @mymath{m} and @mymath{\sigma} as the
@mymath{\sigma}-clipped median and standard deviation from the distances of the
previous @mymath{N/3-1} elements (not including @mymath{v_i}).
-If the value given to @option{--outliersigma} is displayed with @mymath{s},
the @mymath{i}-th element is considered as an outlier when the condition below
is true.
+Gnuastro's Statistics program has the @option{--histogram2d} option for this
task.
+It takes a single argument (either @code{table} or @code{image}) that
specifies the format of the output 2D histogram.
+The two formats will be reviewed separately in the sub-sections below.
+But let's start with the generalities that are common to both (related to the
input, not the output).
+
+You can specify the two columns to be shown using the @option{--column} (or
@option{-c}) option.
+So if you want to plot the color-magnitude diagram from a table with the @code{MAG-R} column on the horizontal axis and @code{COLOR-G-R} on the vertical axis, you can use @option{--column=MAG-R,COLOR-G-R}.
+The number of bins along each dimension can be set with @option{--numbins}
(for first input column) and @option{--numbins2} (for second input column).
-@dispmath{{(v_i-v_{i-1})-m\over \sigma}>s}
+Without specifying any range, the full range of values will be used in each
dimension.
+If you only want to focus on a certain interval of the column values in either dimension, you can use the @option{--greaterequal} and @option{--lessthan} options to limit the values along the first/horizontal dimension, and the @option{--greaterequal2} and @option{--lessthan2} options for the second/vertical dimension.
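+
+For example (a sketch with hypothetical column names, limits and file names; @file{table.fits} and @file{hist-2d.txt} are not from this tutorial), the command below would build a 2D histogram table over 50 bins in each dimension, with custom limits along both:
+
+@example
+$ aststatistics table.fits -cMAG-R,COLOR-G-R --histogram2d=table \
+                --numbins=50  --greaterequal=20  --lessthan=28 \
+                --numbins2=50 --greaterequal2=-1 --lessthan2=2 \
+                --manualbinrange --output=hist-2d.txt
+@end example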
-@noindent
-Since @mymath{i} begins from the @mymath{N/3}-th element in the sorted array
(a quantile of @mymath{1/3=0.33}), the outlier has to be larger than the
@mymath{0.33} quantile value of the dataset (this is usually the case;
otherwise, it is hard to define it as an ``outlier''!).
+@menu
+* 2D histogram as a table for plotting:: Format and usage in table format.
+* 2D histogram as an image:: Format and usage in image format.
+@end menu
-@cindex Bicubic interpolation
-@cindex Interpolation, bicubic
-@cindex Nearest-neighbor interpolation
-@cindex Interpolation, nearest-neighbor
-Once the outlying tiles have been successfully identified and set to NaN, we
use nearest-neighbor interpolation to give a value to all tiles in the image.
-We do not use parametric interpolation methods (like bicubic), because they
will effectively extrapolate on the edges, creating strong artifacts.
-Nearest-neighbor interpolation is very simple: for each tile, we find the
@mymath{N_{ngb}} nearest tiles that had a good value, the tile's value is found
by estimating the median.
-You can set @mymath{N_{ngb}} through the @option{--interpnumngb} option.
-Once all the tiles are given a value, a smoothing step is implemented to
remove the sharp value contrast that can happen on the edges of tiles.
-The size of the smoothing box is set with the @option{--smoothwidth} option.
+@node 2D histogram as a table for plotting, 2D histogram as an image, 2D
Histograms, 2D Histograms
+@subsubsection 2D histogram as a table for plotting
-As mentioned above, the process above is used for any of the basic
measurements (for example, identifying the quantile-based thresholds in
NoiseChisel, or the Sky value in Statistics).
-You can use the check-image feature of NoiseChisel or Statistics to inspect
the steps and visually see each step (all the options that start with
@option{--check}).
-For example, as mentioned in the @ref{NoiseChisel optimization} tutorial, when
given a dataset from a new instrument (with differing noise properties), we
highly recommend to use @option{--checkqthresh} in your first call and visually
inspect how the parameters above affect the final quantile threshold (e.g.,
have the wings of bright sources leaked into the threshold?).
-The same goes for the @option{--checksky} option of Statistics or NoiseChisel.
+When called with the @option{--histogram2d=table} option, Statistics will output a table file with three columns, holding the information of every box/pixel.
+If you asked for @option{--numbins=N} and @option{--numbins2=M}, all three
columns will have @mymath{M\times N} rows (one row for every box/pixel of the
2D histogram).
+The first and second columns are the position of the box along the first and
second dimensions.
+The third column has the number of input points that fall within that
box/pixel.
+For example, you can make high-quality plots within your paper (using the same
@LaTeX{} engine, thus blending very nicely with your text) using
@url{https://ctan.org/pkg/pgfplots, PGFPlots}.
+Below you can see one such minimal example.
+Using your favorite text editor, save it into a file, make the two small corrections in it, then run the commands shown at the top.
+This assumes that you have @LaTeX{} installed; if not, see the respective section in @ref{Bootstrapping dependencies} for the steps to install a minimally sufficient @LaTeX{} package on your system.
+The two parts that need to be corrected are marked with '@code{%% <--}': the first one (@code{XXXXXXXXX}) should be replaced by the value given to the @option{--numbins} option, which is the number of bins along the first dimension.
+The second one (@code{FILE.txt}) should be replaced with the name of the file
generated by Statistics.
+@example
+%% Replace 'XXXXXXXXX' with your selected number of bins in the first
+%% dimension.
+%%
+%% Then run these commands to build the plot in a LaTeX command.
+%% mkdir tikz
+%% pdflatex --shell-escape --halt-on-error report.tex
+\documentclass@{article@}
+%% Load PGFPlots and set it to build the figure separately in a 'tikz'
+%% directory (which has to exist before LaTeX is run). This
+%% "externalization" is very useful to include the commands of multiple
+%% plots in the middle of your paper/report, but also have the plots
+%% separately to use in slides or other scenarios.
+\usepackage@{pgfplots@}
+\usetikzlibrary@{external@}
+\tikzexternalize
+\tikzsetexternalprefix@{tikz/@}
+%% Define colormap for the PGFPlots 2D histogram
+\pgfplotsset@{
+ /pgfplots/colormap=@{hsvwhitestart@}@{
+ rgb255(0cm)=(255,255,255)
+ rgb255(0.10cm)=(128,0,128)
+ rgb255(0.5cm)=(0,0,230)
+ rgb255(1.cm)=(0,255,255)
+ rgb255(2.5cm)=(0,255,0)
+ rgb255(3.5cm)=(255,255,0)
+ rgb255(6cm)=(255,0,0)
+ @}
+@}
+%% Start the printable document
+\begin@{document@}
+ You can write a full paper here and include many figures!
+ Describe what the two axes are, and how you measured them.
+ Also, do not forget to explain what it shows and how to interpret it.
+ You also have separate PDFs for every figure in the `tikz' directory.
+ Feel free to change this text.
+ %% Draw the plot.
+ \begin@{tikzpicture@}
+ \small
+ \begin@{axis@}[
+ width=\linewidth,
+ view=@{0@}@{90@},
+ colorbar horizontal,
+ xlabel=X axis,
+ ylabel=Y axis,
+ ylabel shift=-0.1cm,
+ colorbar style=@{at=@{(0,1.01)@}, anchor=south west,
+ xticklabel pos=upper@},
+ ]
+ \addplot3[
+ surf,
+ shader=flat corner,
+ mesh/ordering=rowwise,
+ mesh/cols=XXXXXXXXX, %% <-- Number of bins in 1st column.
+ ] file @{FILE.txt@}; %% <-- Name of aststatistics output.
+ \end@{axis@}
+\end@{tikzpicture@}
-@node Invoking aststatistics, , Sky value, Statistics
-@subsection Invoking Statistics
+%% End the printable document.
+\end@{document@}
+@end example
-Statistics will print statistical measures of an input dataset (table column
or image).
-The executable name is @file{aststatistics} with the following general template
+Let's assume you have put the @LaTeX{} source above into a plain-text file called @file{report.tex}.
+The PGFPlots call above is configured to build the plots as separate PDF files in a @file{tikz/} directory@footnote{@url{https://www.ctan.org/pkg/pgf, TikZ} is the name of the lower-level engine behind PGFPlots.}.
+This allows you to directly load those PDFs in your slides or other reports.
+Therefore, before building the PDF report, you should first make a
@file{tikz/} directory:
@example
-$ aststatistics [OPTION ...] InputImage.fits
+$ mkdir tikz
@end example
-@noindent
-One line examples:
+To build the final PDF, you should run @command{pdflatex} with the @option{--shell-escape} option, so it can build the separate plot PDF(s).
+We are also adding @option{--halt-on-error} so it immediately aborts in the case of an error (by default, @LaTeX{} will not abort on an error; it will stop and ask for your input to temporarily change things and try fixing the error, but it has a special interface which can be hard to master).
@example
-## Print some general statistics of input image:
-$ aststatistics image.fits
+$ pdflatex --shell-escape --halt-on-error report.tex
+@end example
-## Print some general statistics of column named MAG_F160W:
-$ aststatistics catalog.fits -h1 --column=MAG_F160W
+@noindent
+You can now open @file{report.pdf} to see your very high quality 2D histogram
within your text.
+And if you need the plots separately (for example, for slides), you can take
the PDF inside the @file{tikz/} directory.
-## Make the histogram of the column named MAG_F160W:
-$ aststatistics table.fits -cMAG_F160W --histogram
+@node 2D histogram as an image, , 2D histogram as a table for plotting, 2D
Histograms
+@subsubsection 2D histogram as an image
-## Find the Sky value on image with a given kernel:
-$ aststatistics image.fits --sky --kernel=kernel.fits
+When called with the @option{--histogram2d=image} option, Statistics will output a FITS file with an image/array extension.
+If you asked for @option{--numbins=N} and @option{--numbins2=M} the image will
have a size of @mymath{N\times M} pixels (one pixel per 2D bin).
+Also, the FITS image will have a linear WCS that is scaled to the 2D bin size
along each dimension.
+So when you hover your mouse over any part of the image with a FITS viewer (for example, SAO DS9), besides the number of points in each pixel, you can also directly see the ``coordinates'' of the pixels along the two axes.
+You can also use the optimized and fast FITS viewer features for many aspects
of visually inspecting the distributions (which we will not go into further).
-## Print Sigma-clipped results of records with a MAG_F160W
-## column value between 26 and 27:
-$ aststatistics cat.fits -cMAG_F160W -g26 -l27 --sigmaclip=3,0.2
+@cindex Color-magnitude diagram
+@cindex Diagram, Color-magnitude
+For example, let's assume you want to derive the color-magnitude diagram (CMD)
of the @url{http://uvudf.ipac.caltech.edu, UVUDF survey}.
+You can run the first command below to download the table with magnitudes of
objects in many filters and run the second command to see general column
metadata after it is downloaded.
-## Find the polynomial (to third order) that best fits the X and Y
-## columns of 'table.fits'. Robust fitting will be used to reject
-## outliers. Also, estimate the fitted polynomial on the same input
-## column (with errors).
-$ aststatistics table.fits --fit=polynomial-robust --fitmaxpower=3 \
- -cX,Y --fitestimate=self --output=estimated.fits
+@example
+$ wget http://asd.gsfc.nasa.gov/UVUDF/uvudf_rafelski_2015.fits.gz
+$ asttable uvudf_rafelski_2015.fits.gz -i
+@end example
-## Print the median value of all records in column MAG_F160W that
-## have a value larger than 3 in column PHOTO_Z:
-$ aststatistics tab.txt -rPHOTO_Z -g3 -cMAG_F160W --median
+Let's assume you want the color between the @code{F606W} and @code{F775W} filters (roughly corresponding to the g and r filters in ground-based imaging).
+However, the original table does not have color columns (there would be too
many combinations!).
+Therefore, you can use the @ref{Column arithmetic} feature of Gnuastro's Table program to derive a new table with the @code{F775W} magnitude in one column and the difference between the @code{F606W} and @code{F775W} magnitudes in the other.
+With the second command, you can see the actual values if you like.
-## Calculate the median of the third column in the input table, but only
-## for rows where the mean of the first and second columns is >5.
-$ awk '($1+$2)/2 > 5 @{print $3@}' table.txt | aststatistics --median
+@example
+$ asttable uvudf_rafelski_2015.fits.gz -cMAG_F775W \
+ -c'arith MAG_F606W MAG_F775W -' \
+ --colmetadata=ARITH_1,F606W-F775W,"AB mag" -ocmd.fits
+$ asttable cmd.fits
@end example
@noindent
-@cindex Standard input
-Statistics can take its input dataset either from a file (image or table) or
the Standard input (see @ref{Standard input}).
-If any output file is to be created, the value to the @option{--output}
option, is used as the base name for the generated files.
-Without @option{--output}, the input name will be used to generate an output
name, see @ref{Automatic output}.
-The options described below are particular to Statistics, but for general
operations, it shares a large collection of options with the other Gnuastro
programs, see @ref{Common options} for the full list.
-For more on reading from standard input, please see the description of
@code{--stdintimeout} option in @ref{Input output options}.
-Options can also be given in configuration files, for more, please see
@ref{Configuration files}.
-
-The input dataset may have blank values (see @ref{Blank pixels}), in this
case, all blank pixels are ignored during the calculation.
-Initially, the full dataset will be read, but it is possible to select a
specific range of data elements to use in the analysis of each run.
-You can either directly specify a minimum and maximum value for the range of
data elements to use (with @option{--greaterequal} or @option{--lessthan}), or
specify the range using quantiles (with @option{--qrange}).
-If a range is specified, all pixels outside of it are ignored before any
processing.
-
-@cindex ASCII plot
-When no operation is requested, Statistics will print some general basic
properties of the input dataset on the command-line like the example below (ran
on one of the output images of @command{make check}@footnote{You can try it by
running the command in the @file{tests} directory, open the image with a FITS
viewer and have a look at it to get a sense of how these statistics relate to
the input image/dataset.}).
-This default behavior is designed to help give you a general feeling of how
the data are distributed and help in narrowing down your analysis.
+You can now construct your 2D histogram as a @mymath{100\times100} pixel FITS
image with this command (assuming you want @code{F775W} magnitudes between 22
and 30, colors between -1 and 3 and 100 bins in each dimension).
+Note that without the @option{--manualbinrange} option, the range of each axis will be determined by the values within the columns (which may be larger or smaller than your desired range).
@example
-$ aststatistics convolve_spatial_scaled_noised.fits \
- --greaterequal=9500 --lessthan=11000
-Statistics (GNU Astronomy Utilities) X.X
--------
-Input: convolve_spatial_scaled_noised.fits (hdu: 0)
-Range: from (inclusive) 9500, upto (exclusive) 11000.
-Unit: counts
--------
- Number of elements: 9074
- Minimum: 9622.35
- Maximum: 10999.7
- Mode: 10055.45996
- Mode quantile: 0.4001983908
- Median: 10093.7
- Mean: 10143.98257
- Standard deviation: 221.80834
--------
-Histogram:
- | **
- | ******
- | *******
- | *********
- | *************
- | **************
- | ******************
- | ********************
- | *************************** *
- | ***************************************** ***
- |* **************************************************************
- |-----------------------------------------------------------------
+$ aststatistics cmd.fits -cMAG_F775W,F606W-F775W --histogram2d=image \
+                --numbins=100  --greaterequal=22 --lessthan=30 \
+                --numbins2=100 --greaterequal2=-1 --lessthan2=3 \
+                --manualbinrange --output=cmd-2d-hist.fits
@end example
-Gnuastro's Statistics is a very general purpose program, so to be able to
easily understand this diversity in its operations (and how to possibly run
them together), we will divided the operations into two types: those that do
not respect the position of the elements and those that do (by tessellating the
input on a tile grid, see @ref{Tessellation}).
-The former treat the whole dataset as one and can re-arrange all the elements
(for example, sort them), but the former do their processing on each tile
independently.
-First, we will review the operations that work on the whole dataset.
-
-@cindex AWK
-@cindex GNU AWK
-The group of options below can be used to get single value measurement(s) of
the whole dataset.
-They will print only the requested value as one field in a line/row, like the
@option{--mean}, @option{--median} options.
-These options can be called any number of times and in any order.
-The outputs of all such options will be printed on one line following each
other (with a space character between them).
-This feature makes these options very useful in scripts, or to redirect into
programs like GNU AWK for higher-level processing.
-These are some of the most basic measures, Gnuastro is still under heavy
development and this list will grow.
-If you want another statistical parameter, please contact us and we will do
out best to add it to this list, see @ref{Suggest new feature}.
-
-@menu
-* Input to Statistics:: How to specify the inputs to Statistics.
-* Single value measurements:: Can be used together (like --mean, or
--maximum).
-* Generating histograms and cumulative frequency plots:: Histogram and CFP
tables.
-* Fitting options:: Least squares fitting.
-* Contour options:: Table of contours.
-* Statistics on tiles:: Possible to do single-valued measurements on
tiles.
-@end menu
-
-@node Input to Statistics, Single value measurements, Invoking aststatistics,
Invoking aststatistics
-@subsubsection Input to Statistics
-
-The following set of options are for specifying the input/outputs of
Statistics.
-There are many other input/output options that are common to all Gnuastro
programs including Statistics, see @ref{Input output options} for those.
-
-@table @option
-
-@item -c STR/INT
-@itemx --column=STR/INT
-The column to use when the input file is a table with more than one column.
-See @ref{Selecting table columns} for a full description of how to use this
option.
-For more on how tables are read in Gnuastro, please see @ref{Tables}.
-
-@item -g FLT
-@itemx --greaterequal=FLT
-Limit the range of inputs into those with values greater and equal to what is
given to this option.
-None of the values below this value will be used in any of the processing
steps below.
+@noindent
+If you have SAO DS9, you can now open this FITS file as a normal FITS image,
for example, with the command below.
+Try hovering/zooming over the pixels: not only will you see the number of objects in the UVUDF catalog that fall in each bin, but you will also see the @code{F775W} magnitude and color of that pixel.
-@item -l FLT
-@itemx --lessthan=FLT
-Limit the range of inputs into those with values less-than what is given to
this option.
-None of the values greater or equal to this value will be used in any of the
processing steps below.
+@example
+$ ds9 cmd-2d-hist.fits -cmap sls -zoom to fit
+@end example
-@item -Q FLT[,FLT]
-@itemx --qrange=FLT[,FLT]
-Specify the range of usable inputs using the quantile.
-This option can take one or two quantiles to specify the range.
-When only one number is input (let's call it @mymath{Q}), the range will be
those values in the quantile range @mymath{Q} to @mymath{1-Q}.
-So when only one value is given, it must be less than 0.5.
-When two values are given, the first is used as the lower quantile range and
the second is used as the larger quantile range.
+@noindent
+With the first command below, you can activate the grid feature of DS9 to
actually see the coordinate grid, as well as values on each line.
+With the second command, DS9 will even read the labels of the axes and use
them to generate an almost publication-ready plot.
-@cindex Quantile
-The quantile of a given element in a dataset is defined by the fraction of its
index to the total number of values in the sorted input array.
-So the smallest and largest values in the dataset have a quantile of 0.0 and
1.0.
-The quantile is a very useful non-parametric (making no assumptions about the
input) relative measure to specify a range.
-It can best be understood in terms of the cumulative frequency plot, see
@ref{Histogram and Cumulative Frequency Plot}.
-The quantile of each horizontal axis value in the cumulative frequency plot is
the vertical axis value associate with it.
+@example
+$ ds9 cmd-2d-hist.fits -cmap sls -zoom to fit -grid yes
+$ ds9 cmd-2d-hist.fits -cmap sls -zoom to fit -grid yes \
+ -grid type publication
+@end example
-@end table
+If you are happy with the grid, coloring and the rest, you can also use DS9 to save this as a JPEG image to directly use in your documents/slides with these extra DS9 options (DS9 will write the image to @file{cmd-2d.jpeg} and quit immediately afterwards):
-@node Single value measurements, Generating histograms and cumulative
frequency plots, Input to Statistics, Invoking aststatistics
-@subsubsection Single value measurements
+@example
+$ ds9 cmd-2d-hist.fits -cmap sls -zoom 4 -grid yes \
+ -grid type publication -saveimage cmd-2d.jpeg -quit
+@end example
-@table @option
+@cindex PGFPlots (@LaTeX{} package)
+This is good for a fast progress update.
+But for your paper or more official report, you want to show something with
higher quality.
+For that, you can use the PGFPlots package in @LaTeX{} to add axes in the same
font as your text, sharp grids and many other elegant/powerful features (like
over-plotting interesting points and lines).
+But to load the 2D histogram into PGFPlots, you first need to convert the FITS image into a more standard format, for example, PDF.
+We will use Gnuastro's @ref{ConvertType} for this, and use the
@code{sls-inverse} color map (which will map the pixels with a value of zero to
white):
-@item -n
-@itemx --number
-Print the number of all used (non-blank and in range) elements.
+@example
+$ astconvertt cmd-2d-hist.fits --colormap=sls-inverse \
+ --borderwidth=0 -ocmd-2d-hist.pdf
+@end example
-@item --minimum
-Print the minimum value of all used elements.
+@noindent
+Below you can see a minimally working example of how to add axis numbers,
labels and a grid to the PDF generated above.
+Copy and paste the @LaTeX{} code below into a plain-text file called @file{cmd-report.tex}.
+Notice the @code{xmin}, @code{xmax}, @code{ymin}, @code{ymax} values and how
they are the same as the range specified above.
-@item --maximum
-Print the maximum value of all used elements.
+@example
+\documentclass@{article@}
+\usepackage@{pgfplots@}
+\dimendef\prevdepth=0
+\begin@{document@}
-@item --sum
-Print the sum of all used elements.
+You can write all you want here...
-@item -m
-@itemx --mean
-Print the mean (average) of all used elements.
+\begin@{tikzpicture@}
+ \begin@{axis@}[
+ enlargelimits=false,
+ grid,
+ axis on top,
+ width=\linewidth,
+ height=\linewidth,
+ xlabel=@{Magnitude (F775W)@},
+ ylabel=@{Color (F606W-F775W)@}]
-@item -t
-@itemx --std
-Print the standard deviation of all used elements.
+ \addplot graphics[xmin=22, xmax=30, ymin=-1, ymax=3]
+ @{cmd-2d-hist.pdf@};
+ \end@{axis@}
+\end@{tikzpicture@}
+\end@{document@}
+@end example
-@item -E
-@itemx --median
-Print the median of all used elements.
+@noindent
+Run this command to build your PDF (assuming you have @LaTeX{} and PGFPlots).
-@item -u FLT[,FLT[,...]]
-@itemx --quantile=FLT[,FLT[,...]]
-Print the values at the given quantiles of the input dataset.
-Any number of quantiles may be given and one number will be printed for each.
-Values can either be written as a single number or as fractions, but must be
between zero and one (inclusive).
-Hence, in effect @command{--quantile=0.25 --quantile=0.75} is equivalent to
@option{--quantile=0.25,3/4}, or @option{-u1/4,3/4}.
+@example
+$ pdflatex cmd-report.tex
+@end example
-The returned value is one of the elements from the dataset.
-Taking @mymath{q} to be your desired quantile, and @mymath{N} to be the total
number of used (non-blank and within the given range) elements, the returned
value is at the following position in the sorted array:
@mymath{round(q\times{}N}).
+The improved quality, blending in with the text, vector-graphics resolution
and other features make this plot pleasing to the eye, and let your readers
focus on the main point of your scientific argument.
+PGFPlots can also build the PDF of the plot separately from the rest of the paper/report; see @ref{2D histogram as a table for plotting} for the necessary changes in the preamble.
-@item --quantfunc=FLT[,FLT[,...]]
-Print the quantiles of the given values in the dataset.
-This option is the inverse of the @option{--quantile} and operates similarly
except that the acceptable values are within the range of the dataset, not
between 0 and 1.
-Formally it is known as the ``Quantile function''.
+@node Sigma clipping, Least squares fitting, 2D Histograms, Statistics
+@subsection Sigma clipping
-Since the dataset is not continuous this function will find the nearest
element of the dataset and use its position to estimate the quantile function.
+Let's assume that you have pure noise (centered on zero) with a clear @url{https://en.wikipedia.org/wiki/Normal_distribution,Gaussian distribution} (see also @ref{Photon counting noise}).
+Now let's assume you add very bright objects (signal) onto the image, with very sharp boundaries.
+By a sharp boundary, we mean that there is a clear cutoff (from the noise) at the pixels where the objects finish.
+In other words, at their boundaries, the objects do not fade away into the
noise.
+In such a case, when you plot the histogram (see @ref{Histogram and Cumulative
Frequency Plot}) of the distribution, the pixels relating to those objects will
be clearly separate from pixels that belong to parts of the image that did not
have any signal (were just noise).
+In the cumulative frequency plot, after a steady rise (due to the noise), you would observe a long flat region where, for a certain range of data (horizontal axis), there is no increase in the index (vertical axis).
-@item --quantofmean
-@cindex Quantile of the mean
-Print the quantile of the mean in the dataset.
-This is a very good measure of detecting skewness or outliers.
-The concept is used by programs like NoiseChisel to identify the presence of
signal in a tile of the image (because signal in noise causes skewness).
+@cindex Blurring
+@cindex Cosmic rays
+@cindex Aperture blurring
+@cindex Atmosphere blurring
+Outliers like the example above can significantly bias the measurement of
noise statistics.
+@mymath{\sigma}-clipping is defined as a way to avoid the effect of such
outliers.
+In astronomical applications, cosmic rays (when they collide at a near normal
incidence angle) are a very good example of such outliers.
+The tracks they leave behind in the image are perfectly immune to the blurring
caused by the atmosphere and the aperture.
+They are also very energetic and so their borders are usually clearly
separated from the surrounding noise.
+So @mymath{\sigma}-clipping is very useful in removing their effect on the
data.
+See Figure 15 in Akhlaghi and Ichikawa,
@url{https://arxiv.org/abs/1505.01664,2015}.
-For example, take this simple array: @code{1 2 20 4 5 6 3}.
-The mean is @code{5.85}.
-The nearest element to this mean is @code{6} and the quantile of @code{6} in
this distribution is 0.8333.
-Here is how we got to this: in the sorted dataset (@code{1 2 3 4 5 6 20}),
@code{6} is the 5-th element (counting from zero, since a quantile of zero
corresponds to the minimum, by definition) and the maximum is the 6-th element
(again, counting from zero).
-So the quantile of the mean in this case is @mymath{5/6=0.8333}.
+@mymath{\sigma}-clipping is defined as the very simple iteration below.
+In each iteration, the range of input data might decrease; when the outliers satisfy the conditions above, they will be removed through this iteration.
+The exit criteria will be discussed below.
-In the example above, if we had @code{7} instead of @code{20} (which was an
outlier), then the mean would be @code{4} and the quantile of the mean would be
0.5 (which by definition, is the quantile of the median), showing no outliers.
-As the number of elements increases, the mean itself is less affected by a
small number of outliers, but skewness can be nicely identified by the quantile
of the mean.
+@enumerate
+@item
+Calculate the standard deviation (@mymath{\sigma}) and median (@mymath{m})
+of a distribution.
+@item
+Remove all points that are smaller or larger than
+@mymath{m\pm\alpha\sigma}.
+@item
+Go back to step 1, unless the selected exit criteria is reached.
+@end enumerate
-@item -O
-@itemx --mode
-Print the mode of all used elements.
-The mode is found through the mirror distribution which is fully described in
Appendix C of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
2015}.
-See that section for a full description.
+@noindent
+The reason the median is used as a reference and not the mean is that the mean
is too significantly affected by the presence of outliers, while the median is
less affected, see @ref{Quantifying signal in a tile}.
+As you can tell from this algorithm, besides the condition above (that the
signal have clear high signal to noise boundaries) @mymath{\sigma}-clipping is
only useful when the signal does not cover more than half of the full data set.
+If they do, then the median will lie over the outliers and
@mymath{\sigma}-clipping might remove the pixels with no signal.
-This mode calculation algorithm is non-parametric, so when the dataset is not
large enough (larger than about 1000 elements usually), or does not have a
clear mode it can fail.
-In such cases, this option will return a value of @code{nan} (for the floating
point NaN value).
+There are commonly two exit criteria to stop the @mymath{\sigma}-clipping
+iteration:
-As described in that paper, the easiest way to assess the quality of this mode
calculation method is to use it's symmetricity (see @option{--modesym} below).
-A better way would be to use the @option{--mirror} option to generate the
histogram and cumulative frequency tables for any given mirror value (the mode
in this case) as a table.
-If you generate plots like those shown in Figure 21 of that paper, then your
mode is accurate.
+@itemize
+@item
+When a certain number of iterations has taken place (second value to the
@option{--sclipparams} option is larger than 1).
+@item
+When the new measured standard deviation is within a certain tolerance level
of the old one (second value to the @option{--sclipparams} option is less than
1).
+The tolerance level is defined by:
-@item --modequant
-Print the quantile of the mode.
-You can get the actual mode value from the @option{--mode} described above.
-In many cases, the absolute value of the mode is irrelevant, but its position
within the distribution is important.
-In such cases, this option will become handy.
+@dispmath{\sigma_{old}-\sigma_{new} \over \sigma_{new}}
-@item --modesym
-Print the symmetricity of the calculated mode.
-See the description of @option{--mode} for more.
-This mode algorithm finds the mode based on how symmetric it is, so if the
symmetricity returned by this option is too low, the mode is not too accurate.
-See Appendix C of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
2015} for a full description.
-In practice, symmetricity values larger than 0.2 are mostly good.
+The standard deviation is used because it is heavily influenced by the presence of outliers.
+Therefore, the fact that it stops changing between two iterations is a sign that we have successfully removed outliers.
+Note that with each clipping, the dispersion in the distribution either decreases or stays the same, so @mymath{\sigma_{old}\geq\sigma_{new}}.
+@end itemize
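+
+For example (a minimal sketch, assuming a hypothetical @file{table.fits} with a @code{COLUMN} column), the command below would do @mymath{3\sigma}-clipping with a tolerance of 0.1 and print the clipped median and standard deviation:
+
+@example
+$ aststatistics table.fits -cCOLUMN --sclipparams=3,0.1 \
+                --sigclip-median --sigclip-std
+@end example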
-@item --modesymvalue
-Print the value in the distribution where the mirror and input
-distributions are no longer symmetric, see @option{--mode} and Appendix C
-of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa 2015} for
-more.
+@cartouche
+@noindent
+When working on astronomical images, objects like galaxies and stars are blurred by the atmosphere and the telescope aperture; therefore, their signal sinks into the noise very gradually.
+Galaxies in particular do not appear to have a clear, high signal-to-noise cutoff at all.
+Therefore @mymath{\sigma}-clipping will not be useful in removing their effect
on the data.
-@item --sigclip-number
-Number of elements after applying @mymath{\sigma}-clipping (see @ref{Sigma
clipping}).
-@mymath{\sigma}-clipping configuration is done with the
@option{--sigclipparams} option.
+To gauge if @mymath{\sigma}-clipping will be useful for your dataset, look at
the histogram (see @ref{Histogram and Cumulative Frequency Plot}).
+The ASCII histogram that is printed on the command-line with
@option{--asciihist} is good enough in most cases.
+@end cartouche
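+
+For example (assuming a hypothetical @file{image.fits}), the single command below prints the ASCII histogram directly on your terminal:
+
+@example
+$ aststatistics image.fits --asciihist
+@end example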
-@item --sigclip-median
-Median after applying @mymath{\sigma}-clipping (see @ref{Sigma clipping}).
-@mymath{\sigma}-clipping configuration is done with the
@option{--sigclipparams} option.
-@cindex Outlier
-Here is one scenario where this can be useful: assume you have a table and you
would like to remove the rows that are outliers (not within the
@mymath{\sigma}-clipping range).
-Let's assume your table is called @file{table.fits} and you only want to keep
the rows that have a value in @code{COLUMN} within the @mymath{\sigma}-clipped
range (to @mymath{3\sigma}, with a tolerance of 0.1).
-This command will return the @mymath{\sigma}-clipped median and standard
deviation (used to define the range later).
+@node Least squares fitting, Sky value, Sigma clipping, Statistics
+@subsection Least squares fitting
-@example
-$ aststatistics table.fits -cCOLUMN --sclipparams=3,0.1 \
- --sigclip-median --sigclip-std
-@end example
+@cindex Radial profile
+@cindex Least squares fitting
+@cindex Fitting (least squares)
+@cindex Star formation main sequence
+After completing a good observation, doing robust data reduction and
finalizing the measurements, it is commonly necessary to parameterize the
derived correlations.
+For example, you have derived the radial profile of the PSF of your image (see
@ref{Building the extended PSF}).
+You now want to parameterize the radial profile to estimate the slope.
+Alternatively, you may have found the star formation rate and stellar mass of
your sample of galaxies.
+Now, you want to derive the star formation main sequence as a parametric
relation between the two.
+The fitting functions below can be used for such purposes.
-@cindex GNU AWK
-You can then use the @option{--range} option of Table (see @ref{Table}) to
select the proper rows.
-But for that, you need the actual starting and ending values of the range
(@mymath{m\pm s\sigma}; where @mymath{m} is the median and @mymath{s} is the
multiple of sigma to define an outlier).
-Therefore, the raw outputs of Statistics in the command above are not enough.
+@cindex GSL
+@cindex GNU Scientific Library
+Gnuastro's least squares fitting features are just wrappers over the @url{https://www.gnu.org/software/gsl/doc/html/lls.html, linear} and @url{https://www.gnu.org/software/gsl/doc/html/nls.html, nonlinear} least-squares fitting functions of the GNU Scientific Library (GSL).
+For the low-level details and equations of the methods, please see the GSL documentation.
+The names have been preserved here in Gnuastro to simplify the connection with GSL and to ease following the details in its documentation.
-To get the starting and ending values of the non-outlier range (and put a
`@key{,}' between them, ready to be used in @option{--range}), pipe the result
into AWK.
-But in AWK, we will also need the multiple of @mymath{\sigma}, so we will
define it as a shell variable (@code{s}) before calling Statistics (note how
@code{$s} is used two times now):
+GSL is a very low-level library, designed for power and maximal portability to many scenarios.
+Therefore, calling GSL's functions directly for a fast operation requires a good knowledge of the C programming language and many lines of code.
+As a low-level library, GSL is designed to be the back-end of higher-level programs (like Gnuastro).
+Through the Statistics program, Gnuastro provides a high-level interface to GSL's very powerful least squares fitting engine, which can read/write from/to standard data formats in astronomy.
+A fully working example is shown below.
+
+@cindex Gaussian noise
+@cindex Noise (Gaussian)
+To activate fitting in Statistics, simply give your desired fitting method to
the @option{--fit} option (for the full list of acceptable methods, see
@ref{Fitting options}).
+For example, with the command below, we'll build a fake measurement table
(including noise) from the polynomial @mymath{y=1.23-4.56x+7.89x^2}.
+To understand how this equation translates to the command below (part before
@code{set-y}), see @ref{Reverse polish notation} and @ref{Column arithmetic}.
+We will set the X axis to have values from 0.1 to 2 with steps of 0.01, and add random Gaussian noise to each @mymath{y} measurement: @mymath{\sigma_y=0.1y}.
+To make the random number generation exactly reproducible, we are also setting
the seed (see @ref{Generating random numbers}, which also uses GSL as a
backend).
+To learn more about the @code{mknoise-sigma} operator, see the Arithmetic
program's @ref{Random number generators}.
@example
-$ s=3
-$ aststatistics table.fits -cCOLUMN --sclipparams=$s,0.1 \
- --sigclip-median --sigclip-std \
- | awk '@{s='$s'; printf("%f,%f\n", $1-s*$2, $1+s*$2)@}'
+$ export GSL_RNG_SEED=1664015492
+$ seq 0.1 0.01 2 \
+ | asttable --output=noisy.fits --envseed -c1 \
+ -c'arith 1.23 -4.56 $1 x + 7.89 $1 x $1 x + set-y \
+ 0.1 y x set-yerr \
+ y yerr mknoise-sigma yerr' \
+ --colmetadata=1,X --colmetadata=2,Y \
+ --colmetadata=3,Yerr
@end example
-To pass it onto Table, we will need to keep the printed output from the
command above in another shell variable (@code{r}), not print it.
-In Bash, can do this by putting the whole statement within a @code{$()}:
+@noindent
+Let's have a look at the output plot with TOPCAT using the command below.
@example
-$ s=3
-$ r=$(aststatistics table.fits -cCOLUMN --sclipparams=$s,0.1 \
- --sigclip-median --sigclip-std \
- | awk '@{s='$s'; printf("%f,%f\n", $1-s*$2, $1+s*$2)@}')
-$ echo $r # Just to confirm.
+$ astscript-fits-view noisy.fits
@end example
-Now you can use Table with the @option{--range} option to only print the rows
that have a value in @code{COLUMN} within the desired range:
+@noindent
+To see the error-bars, after opening the scatter plot, go into the ``Form''
tab for that plot.
+Click on the button with a green ``+'' sign followed by ``Forms'' and select
``XYError''.
+On the side-menu, in front of ``Y Positive Error'', select the @code{Yerr}
column of the input table.
+
+As you see, the error bars do indeed increase for higher X axis values.
+Since we have error bars in this example (as in any measurement), we can use
weighted fitting.
+Also, this isn't a linear relation, so we'll use a polynomial to second order
(a maximum power of 2 in the form of @mymath{Y=c_0+c_1X+c_2X^2}):
@example
-$ asttable table.fits --range=COLUMN,$r
-@end example
+$ aststatistics noisy.fits -cX,Y,Yerr --fit=polynomial-weighted \
+ --fitmaxpower=2
+Statistics (GNU Astronomy Utilities) @value{VERSION}
+-------
+Fitting results (remove extra info with '--quiet' or '-q)
+ Input file: noisy.fits (hdu: 1) with 191 non-blank rows.
+ X column: X
+ Y column: Y
+ Weight column: Yerr [Standard deviation of Y in each row]
-To save the resulting table (that is clean of outliers) in another file (for
example, named @file{cleaned.fits}, it can also have a @file{.txt} suffix),
just add @option{--output=cleaned.fits} to the command above.
+Fit function: Y = c0 + (c1 * X^1) + (c2 * X^2) + ... (cN * X^N)
+ N: 2
+ c0: +1.2286211608
+ c1: -4.5127796636
+ c2: +7.8435883943
+Covariance matrix:
+ +0.0010496001 -0.0039928488 +0.0028367390
+ -0.0039928488 +0.0175244127 -0.0138030778
+ +0.0028367390 -0.0138030778 +0.0128129806
-@item --sigclip-mean
-Mean after applying @mymath{\sigma}-clipping (see @ref{Sigma clipping}).
-@mymath{\sigma}-clipping configuration is done with the
@option{--sigclipparams} option.
+Reduced chi^2 of fit:
+ +0.9740670090
+@end example
-@item --sigclip-std
-Standard deviation after applying @mymath{\sigma}-clipping (see @ref{Sigma
clipping}).
-@mymath{\sigma}-clipping configuration is done with the
@option{--sigclipparams} option.
+As you see from the elaborate message, the weighted polynomial fitting has returned the @mymath{c_0}, @mymath{c_1} and @mymath{c_2} of @mymath{Y=c_0+c_1X+c_2X^2} that best represent the data we inserted.
+Our input values were @mymath{c_0=1.23}, @mymath{c_1=-4.56} and @mymath{c_2=7.89}, and the fitted values are @mymath{c_0\approx1.2286}, @mymath{c_1\approx-4.5128} and @mymath{c_2\approx7.8436} (statistically, a very good fit, given that we knew the original values a priori!).
+The covariance matrix is also calculated; it is necessary for calculating error bars on the estimations and contains a lot of information (e.g., possible correlations between parameters).
+Finally, the reduced @mymath{\chi^2} (or @mymath{\chi_{red}^2}) of the fit is
also printed (which was the measure to minimize).
+A @mymath{\chi_{red}^2\approx1} shows a good fit.
+This is useful for real-world scenarios where you don't know the original values a priori.
+For more on interpreting @mymath{\chi_{red}^2\approx1}, see
@url{https://arxiv.org/abs/1012.3754, Andrae et al (2010)}.
-@end table
+The comparison of fitted and input values looks pretty good, but nothing beats visual inspection!
+To see how this looks compared to the data, let's open the table again:
-@node Generating histograms and cumulative frequency plots, Fitting options,
Single value measurements, Invoking aststatistics
-@subsubsection Generating histograms and cumulative freq.
+@example
+$ astscript-fits-view noisy.fits
+@end example
-The list of options below are for those statistical operations that output
more than one value.
-So while they can be called together in one run, their outputs will be
distinct (each one's output will usually be printed in more than one line).
+Repeat the steps above to show the scatter plot and error-bars.
+Then, go to the ``Layers'' menu and select ``Add Function Control''.
+Use the results above to fill the box in front of ``Function Expression'':
@code{1.2286+(-4.5128*x)+(7.8436*x*x)}.
+You will see that the second order polynomial falls very nicely over the
points@footnote{After plotting, you will notice that the legend made the plot
too thin.
+Fortunately you have a lot of empty space within the plot.
+To bring the legend in, click on the ``Legend'' item on the bottom-left menu,
in the ``Location'' tab, click on ``Internal'' and hold and move it to the
top-left in the box below.
+To make the functional fit more clear, you can click on the ``Function'' item
of the bottom-left menu.
+In the ``Style'' tab, change the color and thickness.}.
+But this fit is not perfect: it also has errors (inherited from the
measurement errors).
+We need the covariance matrix to estimate the errors on each point, and that
can be complex to do by hand.
-@table @option
+Fortunately GSL has the tools to easily estimate the function at any point and
also calculate its corresponding error.
+To access this feature within Gnuastro's Statistics program, you should use
the @option{--fitestimate} option.
+You can either give an independent table file name (with
@option{--fitestimatehdu} and @option{--fitestimatecol} to specify the HDU and
column in that file), or just @code{self} so it uses the same X axis column
that was used in this fit.
+Let's use the easier case:
-@item -A
-@itemx --asciihist
-Print an ASCII histogram of the usable values within the input dataset along
with some basic information like the example below (from the UVUDF
catalog@footnote{@url{https://asd.gsfc.nasa.gov/UVUDF/uvudf_rafelski_2015.fits.gz}}).
-The width and height of the histogram (in units of character widths and
heights on your command-line terminal) can be set with the
@option{--numasciibins} (for the width) and @option{--asciiheight} options.
+@example
+$ aststatistics noisy.fits -cX,Y,Yerr --fit=polynomial-weighted \
+ --fitmaxpower=2 --fitestimate=self --output=est.fits
-For a full description of the histogram, please see @ref{Histogram and
Cumulative Frequency Plot}.
-An ASCII plot is certainly very crude and cannot be used in any publication,
but it is very useful for getting a general feeling of the input dataset very
fast and easily on the command-line without having to take your hands off the
keyboard (which is a major distraction!).
-If you want to try it out, you can write it all in one line and ignore the
@key{\} and extra spaces.
+...[[truncated; same as above]]...
-@example
-$ aststatistics uvudf_rafelski_2015.fits.gz --hdu=1 \
- --column=MAG_F160W --lessthan=40 \
- --asciihist --numasciibins=55
-ASCII Histogram:
-Number: 8593
-Y: (linear: 0 to 660)
-X: (linear: 17.7735 -- 31.4679, in 55 bins)
- | ****
- | *****
- | ******
- | ********
- | *********
- | ***********
- | **************
- | *****************
- | ***********************
- | ********************************
- |*** ***************************************************
- |-------------------------------------------------------
+Requested estimation:
+ Written to: est.fits
@end example
-@item --asciicfp
-Print the cumulative frequency plot of the usable elements in the input
dataset.
-Please see descriptions under @option{--asciihist} for more, the example below
is from the same input table as that example.
-To better understand the cumulative frequency plot, please see @ref{Histogram
and Cumulative Frequency Plot}.
+The first lines of the printed text are the same as before.
+Afterwards, you will see a new line printed in the output, saying that the estimation was written to @file{est.fits}.
+You can now inspect the two tables with TOPCAT again with the command below.
+After TOPCAT opens, plot both scatter plots:
@example
-$ aststatistics uvudf_rafelski_2015.fits.gz --hdu=1 \
- --column=MAG_F160W --lessthan=40 \
- --asciicfp --numasciibins=55
-ASCII Cumulative frequency plot:
-Y: (linear: 0 to 8593)
-X: (linear: 17.7735 -- 31.4679, in 55 bins)
- | *******
- | **********
- | ***********
- | *************
- | **************
- | ***************
- | *****************
- | *******************
- | ***********************
- | ******************************
- |*******************************************************
- |-------------------------------------------------------
+$ astscript-fits-view noisy.fits est.fits
@end example
-@item -H
-@itemx --histogram
-Save the histogram of the usable values in the input dataset into a table.
-The first column is the value at the center of the bin and the second is the
number of points in that bin.
-If the @option{--cumulative} option is also called with this option in a run,
then the table will have three columns (the third is the cumulative frequency
plot).
-Through the @option{--numbins}, @option{--onebinstart}, or
@option{--manualbinrange}, you can modify the first column values and with
@option{--normalize} and @option{--maxbinone} you can modify the second columns.
-See below for the description of each.
+It is clear that they fall nicely on top of each other.
+The @file{est.fits} table also has a third column with error bars.
+You can follow the same steps as before and draw the error bars to see how they compare with the scatter of the measured data.
+The estimated errors are much smaller than the error in each measured point because we had a very good sampling of the function in our noisy data.
-By default (when no @option{--output} is specified) a plain text table will be
created, see @ref{Gnuastro text table format}.
-If a FITS name is specified, you can use the common option
@option{--tableformat} to have it as a FITS ASCII or FITS binary format, see
@ref{Common options}.
-This table can then be fed into your favorite plotting tool and get a much
more clean and nice histogram than what the raw command-line can offer you
(with the @option{--asciihist} option).
+Another useful point with the estimated output file is that it contains all
the fitting outputs as keywords in the header:
-@item --histogram2d
-Save the 2D histogram of two input columns into an output file, see @ref{2D
Histograms}.
-The output will have three columns: the first two are the coordinates of each
box's center in the first and second dimensions/columns.
-The third will be number of input points that fall within that box.
+@example
+$ astfits est.fits -h1
+...[[truncated]]...
-@item -C
-@itemx --cumulative
-Save the cumulative frequency plot of the usable values in the input dataset
into a table, similar to @option{--histogram}.
+ / Fit results
+FITTYPE = 'polynomial-weighted' / Functional form of the fitting.
+FITMAXP = 2 / Maximum power of polynomial.
+FITIN = 'noisy.fits' / Name of file with input columns.
+FITINHDU= '1 ' / Name or Number of HDU with input cols.
+FITXCOL = 'X ' / Name or Number of independent (X) col.
+FITYCOL = 'Y ' / Name or Number of measured (Y) column.
+FITWCOL = 'Yerr ' / Name or Number of weight column.
+FITWNAT = 'Standard deviation' / Nature of weight column.
+FRDCHISQ= 0.974067008958516 / Reduced chi^2 of fit.
+FITC0 = 1.22862116084727 / C0: multiple of x^0 in polynomial
+FITC1 = -4.51277966356177 / C1: multiple of x^1 in polynomial
+FITC2 = 7.84358839431161 / C2: multiple of x^2 in polynomial
+FCOV11 = 0.00104960011629718 / Covariance matrix element (1,1).
+FCOV12 = -0.00399284880859776 / Covariance matrix element (1,2).
+FCOV13 = 0.00283673901863388 / Covariance matrix element (1,3).
+FCOV21 = -0.00399284880859776 / Covariance matrix element (2,1).
+FCOV22 = 0.0175244126670659 / Covariance matrix element (2,2).
+FCOV23 = -0.0138030778380786 / Covariance matrix element (2,3).
+FCOV31 = 0.00283673901863388 / Covariance matrix element (3,1).
+FCOV32 = -0.0138030778380786 / Covariance matrix element (3,2).
+FCOV33 = 0.0128129806394559 / Covariance matrix element (3,3).
-@item -s
-@itemx --sigmaclip
-Do @mymath{\sigma}-clipping on the usable pixels of the input dataset.
-See @ref{Sigma clipping} for a full description on @mymath{\sigma}-clipping
and also to better understand this option.
-The @mymath{\sigma}-clipping parameters can be set through the
@option{--sclipparams} option (see below).
+...[[truncated]]...
+@end example
-@item --mirror=FLT
-Make a histogram and cumulative frequency plot of the mirror distribution for
the given dataset when the mirror is located at the value to this option.
-The mirror distribution is fully described in Appendix C of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa 2015} and
currently it is only used to calculate the mode (see @option{--mode}).
+In scenarios where you don't want the estimation, but only the fitted parameters, all that verbose, human-friendly text or FITS keywords can be an annoying extra step.
+For such cases, you should use the @option{--quiet} option like below.
+For such cases, you should use the @option{--quiet} option like below.
+It will print the parameters, rows of the covariance matrix and
@mymath{\chi_{red}^2} on separate lines with nothing extra.
+This allows you to parse the values in any way that you would like.
-Just note that the mirror distribution is a discrete distribution like the
input, so while you may give any number as the value to this option, the actual
mirror value is the closest number in the input dataset to this value.
-If the two numbers are different, Statistics will warn you of the actual
mirror value used.
+@example
+$ aststatistics noisy.fits -cX,Y,Yerr --fit=polynomial-weighted \
+ --fitmaxpower=2 --quiet
++1.2286211608 -4.5127796636 +7.8435883943
++0.0010496001 -0.0039928488 +0.0028367390
+-0.0039928488 +0.0175244127 -0.0138030778
++0.0028367390 -0.0138030778 +0.0128129806
++0.9740670090
+@end example
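+
+For example (a minimal sketch using GNU AWK), since the fitted coefficients are on the first line of the @option{--quiet} output, you can extract just them into a shell variable like this:
+
+@example
+$ coeffs=$(aststatistics noisy.fits -cX,Y,Yerr \
+                         --fit=polynomial-weighted \
+                         --fitmaxpower=2 --quiet \
+             | awk 'NR==1')
+$ echo $coeffs
+@end example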
-This option will make a table as output.
-Depending on your selected name for the output, it will be either a FITS table
or a plain text table (which is the default).
-It contains three columns: the first is the center of the bins, the second is
the histogram (with the largest value set to 1) and the third is the normalized
cumulative frequency plot of the mirror distribution.
-The bins will be positioned such that the mode is on the starting interval of
one of the bins to make it symmetric around the mirror.
-With this output file and the input histogram (that you can generate in
another run of Statistics, using the @option{--onebinvalue}), it is possible to
make plots like Figure 21 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa 2015}.
+As a final example, because real data usually have outliers, let's look at the
``robust'' polynomial fit which has special features to remove outliers.
+First, we need to add some outliers to the table.
+To do this, we'll make a plain-text table with @command{echo}, and use Table's
@option{--catrowfile} to concatenate (or append) those two rows to the original
table.
+Finally, we'll run the same fitting step as above:
-@end table
+@example
+$ echo "0.6 20 0.01" > outliers.txt
+$ echo "0.8 20 0.01" >> outliers.txt
-The list of options below allow customization of the histogram and cumulative
frequency plots (for the @option{--histogram}, @option{--cumulative},
@option{--asciihist}, and @option{--asciicfp} options).
+$ asttable noisy.fits --catrowfile=outliers.txt \
+ --output=with-outlier.fits
-@table @option
+$ aststatistics with-outlier.fits -cX,Y,Yerr --fit=polynomial-weighted \
+ --fitmaxpower=2 --fitestimate=self \
+ --output=est-out.fits
+
+Statistics (GNU Astronomy Utilities) @value{VERSION}
+-------
+Fitting results (remove extra info with '--quiet' or '-q)
+ Input file: with-outlier.fits (hdu: 1) with 193 non-blank rows.
+ X column: X
+ Y column: Y
+ Weight column: Yerr [Standard deviation of Y in each row]
-@item --numbins
-The number of bins (rows) to use in the histogram and the cumulative frequency
plot tables (outputs of @option{--histogram} and @option{--cumulative}).
+Fit function: Y = c0 + (c1 * X^1) + (c2 * X^2) + ... (cN * X^N)
+ N: 2
+ c0: -13.6446036899
+ c1: +66.8463258547
+ c2: -30.8746303591
-@item --numasciibins
-The number of bins (characters) to use in the ASCII plots when printing the
histogram and the cumulative frequency plot (outputs of @option{--asciihist}
and @option{--asciicfp}).
+Covariance matrix:
+ +0.0007889160 -0.0027706310 +0.0022208939
+ -0.0027706310 +0.0113922468 -0.0100306732
+ +0.0022208939 -0.0100306732 +0.0094087226
-@item --asciiheight
-The number of lines to use when printing the ASCII histogram and cumulative
frequency plot on the command-line (outputs of @option{--asciihist} and
@option{--asciicfp}).
+Reduced chi^2 of fit:
+ +4501.8356719150
-@item -n
-@itemx --normalize
-Normalize the histogram or cumulative frequency plot tables (outputs of
@option{--histogram} and @option{--cumulative}).
-For a histogram, the sum of all bins will become one and for a cumulative
frequency plot the last bin value will be one.
+Requested estimation:
+  Written to: est-out.fits
+@end example
-@item --maxbinone
-Divide all the histogram values by the maximum bin value so it becomes one and
the rest are similarly scaled.
-In some situations (for example, if you want to plot the histogram and
cumulative frequency plot in one plot) this can be very useful.
+We see that the coefficient values have changed significantly and that
@mymath{\chi_{red}^2} has increased to @mymath{4501}!
+Recall that a good fit should have @mymath{\chi_{red}^2\approx1}.
+These numbers clearly show that the fit was bad, but again, nothing beats a
visual inspection.
+To see the effect of those outliers visually, let's plot the fit with the
command below.
+You will see that those two points have caused a sharp turn in the fitted
result, ruining the fit.
-@item --onebinstart=FLT
-Make sure that one bin starts with the value to this option.
-In practice, this will shift the bins used to find the histogram and
cumulative frequency plot such that one bin's lower interval becomes this value.
+@example
+$ astscript-fits-view with-outlier.fits est-out.fits
+@end example
-For example, when a histogram range includes negative and positive values and
zero has a special significance in your analysis, then zero might fall
somewhere in one bin.
-As a result that bin will have counts of positive and negative.
-By setting @option{--onebinstart=0}, you can make sure that one bin will only
count negative values in the vicinity of zero and the next bin will only count
positive ones in that vicinity.
+For such cases, GSL has
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#robust-linear-regression,
Robust linear regression}.
+In Gnuastro's Statistics, you can access it with
@option{--fit=polynomial-robust}, like the example below.
+Just note that the robust method doesn't take an error column (because it
estimates the errors internally while rejecting outliers, based on the method).
-@cindex NaN
-Note that by default, the first row of the histogram and cumulative frequency
plot show the central values of each bin.
-So in the example above you will not see the 0.000 in the first column, you
will see two symmetric values.
+@example
+$ aststatistics with-outlier.fits -cX,Y --fit=polynomial-robust \
+ --fitmaxpower=2 --fitestimate=self \
+ --output=est-out.fits --quiet
-If the value is not within the usable input range, this option will be ignored.
-When it is, this option is the last operation before the bins are finalized,
therefore it has a higher priority than options like @option{--manualbinrange}.
+$ astfits est-out.fits -h1 | grep ^FITC
+FITC0 = 1.20422691185238 / C0: multiple of x^0 in polynomial
+FITC1 = -4.4779253576348 / C1: multiple of x^1 in polynomial
+FITC2 = 7.84986153686548 / C2: multiple of x^2 in polynomial
-@item --manualbinrange
-Use the values given to the @option{--greaterequal} and @option{--lessthan} to
define the range of all bin-based calculations like the histogram.
-This option itself does not take any value, but just tells the program to use
the values of those two options instead of the minimum and maximum values of a
plot.
-If any of the two options are not given, then the minimum or maximum will be
used respectively.
-Therefore, if none of them are called calling this option is redundant.
+$ astscript-fits-view with-outlier.fits est-out.fits
+@end example
-The @option{--onebinstart} option has a higher priority than this option.
-In other words, @option{--onebinstart} takes effect after the range has been
finalized and the initial bins have been defined, therefore it has the power to
(possibly) shift the bins.
-If you want to manually set the range of the bins @emph{and} have one bin on a
special value, it is thus better to avoid @option{--onebinstart}.
+It is clear that the coefficients are very similar to the no-outlier scenario
above, and if you run the second command to view the scatter plots in TOPCAT,
you will also see that the fit nicely follows the curve and is not affected by
those two points.
+GSL provides many methods to reject outliers.
+For their full list, see the description of @option{--fitrobust} in
@ref{Fitting options}.
+For a description of the outlier rejection methods, see the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#c.gsl_multifit_robust_workspace,
GSL manual}.
-@item --numbins2=INT
-Similar to @option{--numbins}, but for the second column when a 2D histogram
is requested, see @option{--histogram2d}.
+You may have noticed that, unlike the cases before, the last Statistics
command above didn't print anything on the standard output.
+This is because @option{--quiet} and @option{--fitestimate} were called
together.
+In this case, all the fitting parameters are written as FITS keywords in the
output file, and because of the @option{--quiet} option they are no longer
printed on standard output.
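+
+For example, to read those keywords later, you can use Gnuastro's Fits
+program (a minimal sketch; the @code{FIT} keyword prefix is the one shown
+in the robust-fit example above):
+
+@example
+$ astfits est-out.fits -h1 | grep ^FIT
+@end example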
-@item --greaterequal2=FLT
-Similar to @option{--greaterequal}, but for the second column when a 2D
histogram is requested, see @option{--histogram2d}.
+@node Sky value, Invoking aststatistics, Least squares fitting, Statistics
+@subsection Sky value
-@item --lessthan2=FLT
-Similar to @option{--lessthan}, but for the second column when a 2D histogram
is requested, see @option{--histogram2d}.
+@cindex Sky
+One of the most important aspects of a dataset is its reference value: the
value of the dataset where there is no signal.
+Without knowing, and thus removing the effect of, this value it is impossible
to compare the derived results of many high-level analyses over the dataset
with other datasets (in the attempt to associate our results with the ``real''
world).
-@item --onebinstart2=FLT
-Similar to @option{--onebinstart}, but for the second column when a 2D
histogram is requested, see @option{--histogram2d}.
+In astronomy, this reference value is known as the ``Sky'' value: the value
that the noise fluctuates around, where there is no signal from detectable
objects or artifacts (for example, galaxies, stars, planets, comets, star
spikes or internal optical ghosts).
+Depending on the dataset, the Sky value may be a fixed value over the whole
dataset, or it may vary based on location.
+For an example of the latter case, see Figure 11 in
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
-@end table
+Because of the significance of the Sky value in astronomical data analysis, we
have devoted this subsection to a thorough review of it.
+We start with a discussion of its definition (@ref{Sky value definition}).
+In the astronomical literature, researchers use a variety of methods to
estimate the Sky value, so in @ref{Sky value misconceptions} we review those
methods and discuss their biases.
+From the definition of the Sky value, the most accurate way to estimate it is
to run a detection algorithm (for example, @ref{NoiseChisel}) over the dataset
and use the undetected pixels.
+However, a cruder (but simpler) method may be useful when good direct
detection is not initially possible (for example, due to too many cosmic rays
in a shallow image); it is discussed in @ref{Quantifying signal in a tile}.
-@node Fitting options, Contour options, Generating histograms and cumulative
frequency plots, Invoking aststatistics
-@subsubsection Fitting options
+@menu
+* Sky value definition:: Definition of the Sky/reference value.
+* Sky value misconceptions:: Wrong methods to estimate the Sky value.
+* Quantifying signal in a tile:: Method to estimate the presence of signal.
+@end menu
-With the options below, you can customize the least squares fitting features
of Statistics.
-For a tutorial of the usage of least squares fitting in Statistics, please see
@ref{Least squares fitting}.
-Here, we will just review the details of each option.
+@node Sky value definition, Sky value misconceptions, Sky value, Sky value
+@subsubsection Sky value definition
-To activate least squares fitting in Statistics, it is necessary to use the
@option{--fit} option to specify the type of fit you want to do.
-See the description of @option{--fit} for the various available fitting models.
-The fitting models that account for weights require three input columns, while
the non-weighted ones only take two input columns.
-Here is a summary of the input columns:
+@cindex Sky value
+This analysis is taken from @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa (2015)}.
+Let's assume that all instrument defects -- bias, dark and flat -- have been
corrected and the magnitude (see @ref{Brightness flux magnitude}) of a detected
object, @mymath{O}, is desired.
+The sources of flux on pixel@footnote{For this analysis the dimension of the
data (image) is irrelevant.
+So if the data is an image (2D) with width of @mymath{w} pixels, then a pixel
located on column @mymath{x} and row @mymath{y} (where all counting starts from
zero and (0, 0) is located on the bottom left corner of the image), would have
an index: @mymath{i=x+y\times{}w}.} @mymath{i} of the image can be written as
follows:
-@enumerate
+@itemize
@item
-The first input column is assumed to be the independent variable (on the
horizontal axis of a plot, or @mymath{X} in the equations of each fit).
+Contribution from the target object (@mymath{O_i}).
@item
-The second input column is assumed to be the measured value (on the vertical
axis of a plot, or @mymath{Y} in the equation above).
+Contribution from other detected objects (@mymath{D_i}).
@item
-The third input column is only for fittings with a weight.
-It is assumed to be the ``weight'' of the measurement column.
-The nature of the ``weight'' can be set with the @option{--fitweight} option,
for example, if you have the standard deviation of the error in @mymath{Y}, you
can use @option{--fitweight=std} (which is the default, so unless the default
value has been changed, you will not need to set this).
-@end enumerate
-
-If three columns are given to a model without weight, or two columns are given
to a model that requires weights, Statistics will abort and inform you.
-Below you can see an example of fitting with the same linear model, once
weighted and once without weights.
-
-@example
-$ aststatistics table.fits --column=X,Y --fit=linear
-$ aststatistics table.fits --column=X,Y,Yerr --fit=linear-weighted
-@end example
+Undetected objects or the fainter undetected regions of bright objects
(@mymath{U_i}).
+@item
+@cindex Cosmic rays
+A cosmic ray (@mymath{C_i}).
+@item
+@cindex Background flux
+The background flux, which is defined to be the count if none of the others
exists on that pixel (@mymath{B_i}).
+@end itemize
+@noindent
+The total flux in this pixel (@mymath{T_i}) can thus be written as:
-The output of the fitting can be in three modes listed below.
-For a complete example, see the tutorial in @ref{Least squares fitting}).
-@table @asis
-@item Human friendly format
-By default (for example, the commands above) the output is an elaborate
description of the model parameters.
-For example, @mymath{c_0} and @mymath{c_1} in the linear model
(@mymath{Y=c_0+c_1X}).
-Their covariance matrix and the reduced @mymath{\chi^2} of the fit are also
printed on the output.
-@item Raw numbers
-If you don't need the human friendly components of the output (which are
annoying when you want to parse the outputs in some scenarios), you can use
@option{--quiet} option.
-Only the raw output numbers will be printed.
-@item Estimate on a custom X column
-Through the @option{--fitestimate} option, you can specify an independent
table column to estimate the fit (it can also take a single value).
-See the description of this option for more.
-@end table
+@dispmath{T_i=B_i+D_i+U_i+C_i+O_i.}
-@table @option
-@item -f STR
-@itemx --fit=STR
-The name of the fitting method to use.
-They are based on the @url{https://www.gnu.org/software/gsl/doc/html/lls.html,
linear} and @url{https://www.gnu.org/software/gsl/doc/html/nls.html, nonlinear}
least-squares fitting functions of the GNU Scientific Library (GSL).
-@table @code
-@item linear
-@mymath{Y=c_0+c_1X}
-@item linear-weighted
-@mymath{Y=c_0+c_1X}; accounting for ``weights'' in @mymath{Y}.
-@item linear-no-constant
-@mymath{Y=c_1X}.
-@item linear-no-constant-weighted
-@mymath{Y=c_1X}; accounting for ``weights'' in @mymath{Y}.
-@item polynomial
-@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}; the maximum required power
(@mymath{n}) is specified by @option{--fitmaxpower}.
-@item polynomial-weighted
-@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}; accounting for ``weights'' in
@mymath{Y}.
-The maximum required power (@mymath{n}) is specified by @option{--fitmaxpower}.
-@item polynomial-robust
-@cindex Robust polynomial fit
-@cindex Polynomial fit (robust)
-@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}; rejects outliers.
-The function to use for outlier removal can be specified with the
@option{--fitrobust} option described below.
-This model doesn't take weights since they are calculated internally based on
the outlier removal function (requires two input columns).
-The maximum required power (@mymath{n}) is specified by @option{--fitmaxpower}.
+@cindex Cosmic ray removal
+@noindent
+By definition, @mymath{D_i} is detected, and it can be assumed that it is
correctly estimated (deblended) and subtracted; we can thus set @mymath{D_i=0}.
+There are also methods to detect and remove cosmic rays, for example, the
method described in van Dokkum (2001)@footnote{van Dokkum, P. G. (2001).
+Publications of the Astronomical Society of the Pacific. 113, 1420.}, or by
comparing multiple exposures.
+This allows us to set @mymath{C_i=0}.
+Note that in practice, @mymath{D_i} and @mymath{U_i} are correlated, because
they both directly depend on the detection algorithm and its input parameters.
+Also note that no detection or cosmic ray removal algorithm is perfect.
+With these limitations in mind, the observed Sky value for this pixel
(@mymath{S_i}) can be defined as
-For a comprehensive review of ``robust'' fitting and the available functions,
please see the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#robust-linear-regression,
Robust linear regression} section of the GNU Scientific Library.
-@end table
+@cindex Sky value
+@dispmath{S_i\equiv{}B_i+U_i.}
-@item --fitweight=STR
-The nature of the ``weight'' column (when a weight is necessary for the model).
-It can take one of the following values:
-@table @code
-@item std
-Standard deviation of each @mymath{Y} axis measurement: this is the usual
``error'' associated with a measurement (for example, in @ref{MakeCatalog}) and
is the default value to this option.
-@item var
-Variance of each @mymath{Y} axis measurement.
-Assuming a Gaussian distribution with standard deviation @mymath{\sigma}, the
variance is @mymath{\sigma^2}.
-@item inv-var
-Inverse variance of each @mymath{Y} axis measurement.
-Assuming a Gaussian distribution with standard deviation @mymath{\sigma}, the
variance is @mymath{1/\sigma^2}.
-@end table
+@noindent
+Therefore, as the detection process (algorithm and input parameters) becomes
more accurate, or @mymath{U_i\to0}, the Sky value will tend to the background
value or @mymath{S_i\to B_i}.
+Hence, we see that while @mymath{B_i} is an inherent property of the data
(pixel in an image), @mymath{S_i} depends on the detection process.
+Over a group of pixels, for example, in an image or part of an image, this
equation translates to the average of the @mymath{N} undetected pixels
(Sky@mymath{={1\over N}\sum{S_i}}).
+With this definition of Sky, the object flux in the data can be calculated,
per pixel, with
-@item --fitmaxpower=INT
-The maximum power (an integer) in a polynomial (@mymath{n} in
@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}).
-This is only relevant when one of the polynomial models is given to
@option{--fit}.
-The fit will return @mymath{n+1} coefficients.
+@dispmath{ T_{i}=S_{i}+O_{i} \quad\rightarrow\quad
+ O_{i}=T_{i}-S_{i}.}
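+
+For example, once an estimate of the Sky is available (such as the output
+of Statistics' @option{--sky} option), this per-pixel subtraction can be
+done with Gnuastro's Arithmetic program; a minimal sketch with hypothetical
+file names:
+
+@example
+$ astarithmetic image.fits sky.fits - --output=sky-subtracted.fits
+@end example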
-@item --fitrobust=STR
-The function for rejecting outliers in the @code{polynomial-robust} fitting
model.
-For a comprehensive review of ``robust'' fitting and the available functions,
please see the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#robust-linear-regression,
Robust linear regression} section of the GNU Scientific Library.
-This function can take the following values:
-@table @code
-@item bisquare
-@cindex Tukey’s biweight (bisquare) function
-@cindex Biweight function of Tukey
-@cindex Bisquare function of Tukey
-Tukey’s biweight (bisquare) function, this is the default function.
-According to the GSL manual, this is a good general purpose weight function.
-@item cauchy
-@cindex Cauchy's function (robust weight)
-@cindex Lorentzian function (robust weight)
-Cauchy’s function (also known as the Lorentzian function).
-It doesn't guarantee a unique solution, so it should be used with care.
-@item fair
-@cindex Fair function (robust weight)
-The fair function.
-It guarantees a unique solution and has continuous derivatives to three orders.
-@item huber
-@cindex Huber function (robust weight)
-Huber's @mymath{\rho} function.
-This is also a good general purpose weight function for rejecting outliers,
but can cause difficulty in some special scenarios.
-@item ols
-Ordinary Least Squares (OLS) solution with a constant weight of unity.
-@item welsch
-@cindex Welsch function (robust weight)
-Welsch function which is useful when the residuals follow an exponential
distribution.
-@end table
+@cindex photo-electrons
+In the fainter outskirts of an object, only a very small fraction of the
photo-electrons in a pixel actually belongs to the object; the rest is caused
by random factors (noise), see Figure 1b in
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
+Therefore even a small over-estimation of the Sky value will result in the
loss of a very large portion of most galaxies.
+Besides the lost area/brightness, this will also cause an over-estimation of
the Sky value and thus an even larger under-estimation of the object's
magnitude.
+It is thus very important to detect the diffuse flux of all objects, even
those that are not your primary target.
+
+In summary, the more accurately the Sky is measured, the more accurately the
magnitude (calculated from the sum of pixel values) of the target object can be
measured (photometry).
+Any under/over-estimation in the Sky will directly translate to an
over/under-estimation of the measured object's magnitude.
+
+@cartouche
+@noindent
+The @strong{Sky value} is only correctly found when all the detected
+objects (@mymath{D_i} and @mymath{C_i}) have been removed from the data.
+@end cartouche
+
+
+
+
+@node Sky value misconceptions, Quantifying signal in a tile, Sky value
definition, Sky value
+@subsubsection Sky value misconceptions
+
+As defined in @ref{Sky value}, the sky value is only accurately defined when
the detection algorithm is not significantly reliant on the sky value, in
particular for defining its detection threshold.
+However, most signal-based detection tools@footnote{According to Akhlaghi and
Ichikawa (2015), signal-based detection is a detection process that relies
heavily on assumptions about the to-be-detected objects.
+This method was the most heavily used technique prior to the introduction of
NoiseChisel in that paper.} use the sky value as a reference to define the
detection threshold.
+These older techniques therefore had to rely on approximations based on other
assumptions about the data.
+A review of those other techniques can be seen in Appendix A of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
+
+These methods were extensively used in astronomical data analysis for several
decades, so they have given rise to many misconceptions, ambiguities and
disagreements about the sky value and how to measure it.
+In summary, the major methods used so far are an approximation of the mode of
the image pixel distribution and @mymath{\sigma}-clipping; both are reviewed
below.
-@item --fitestimate=STR/FLT
-Estimate the fitted function at a single point or a complete column of points.
-The input @mymath{X} axis positions to estimate the function can be specified
in the following ways:
@itemize
+@cindex Histogram
+@cindex Distribution mode
+@cindex Mode of a distribution
+@cindex Probability density function
@item
-A real number: the fitted function will be estimated at that @mymath{X}
position and the corresponding @mymath{Y} and its error will be printed to
standard output.
-@item
-@code{self}: in this mode, the same X axis column that was used in the fit
will be used for estimating the fitted function.
-This can be useful to visually/easily check the fit, see @ref{Least squares
fitting}.
+To find the mode of a distribution those methods would either have to assume
(or find) a certain probability density function (PDF) or use the histogram.
+But astronomical datasets can have any distribution, making it almost
impossible to define a generic function.
+Also, histogram-based results are very inaccurate (there is a large
dispersion) and depend on the chosen histogram bin-widths.
+Generally, the mode of a distribution also shifts as signal is added.
+Therefore, even if it is accurately measured, the mode is a biased measure for
the Sky value.
+
+@cindex Sigma-clipping
@item
-A file name: If the value is none of the above, Statistics expects it to be a
file name containing a table.
-If the file is a FITS file, the HDU containing the table should be specified
with the @option{--fitestimatehdu} option.
-The column of the table to use for the @mymath{X} axis points should be
specified with the @option{--fitestimatecol} option.
+Another approach was to iteratively clip the brightest pixels in the image
(which is known as @mymath{\sigma}-clipping; see @ref{Sigma clipping} for a
complete explanation, and the short example after this list for a hands-on
try).
+@mymath{\sigma}-clipping is useful when there are clear outliers (for
example, an object with a sharp edge in an image).
+However, real astronomical objects have diffuse and faint wings that penetrate
deeply into the noise, see Figure 1 in @url{https://arxiv.org/abs/1505.01664,
Akhlaghi and Ichikawa (2015)}.
@end itemize
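+
+As referenced in the list above, you can get a hands-on feeling for
+@mymath{\sigma}-clipping with Statistics; a minimal sketch on a
+hypothetical @file{image.fits} (clipping at @mymath{3\sigma} with a
+tolerance of 0.2):
+
+@example
+$ aststatistics image.fits --sigmaclip=3,0.2
+@end example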
-The output in this mode can be customized in the following ways:
-@itemize
-@item
-If a single floating point value is given @option{--fitestimate}, the fitted
function will be estimated on that point and printed to standard output.
-@item
-When nothing is given to @option{--output}, the independent column and the
estimated values and errors are printed on the standard output.
-@item
-If a file name is given to @option{--output}, the estimated table above is
saved in that file.
-It can have any of the formats in @ref{Recognized table formats}.
-As a FITS file, all the fit outputs (coefficients, covariance matrix and
reduced @mymath{\chi^2}) are kept as FITS keywords in the same HDU of the
estimated table.
-For a complete example, see @ref{Least squares fitting}.
-When the covariance matrix (and thus the @mymath{\chi^2}) cannot be calculated
(for example if you only have two rows!), the printed values on the terminal
will be NaN.
-However, the FITS standard does not allow NaN values in keyword values!
-Therefore, when writing the @mymath{\chi^2} and covariance matrix elements
into the output FITS keywords, the largest value of the 64-bit floating point
type will be written: @mymath{1.79769313486232\times10^{308}}; see @ref{Numeric
data types}.
+As discussed in @ref{Sky value}, the sky value can only be correctly defined
as the average of undetected pixels.
+Therefore all such approaches that try to approximate the sky value prior to
detection are ultimately poor approximations.
-@item
-When @option{--quiet} is given with @option{--fitestimate}, the fitted
parameters are no longer printed on the standard output; they are available as
FITS keywords in the file given to @option{--output}.
-@end itemize
-@item --fitestimatehdu=STR/INT
-HDU name or counter (counting from zero) that contains the table to be used
for the estimating the fitted function over many points through
@option{--fitestimate}.
-For more on selecting a HDU, see the description of @option{--hdu} in
@ref{Input output options}.
-@item --fitestimatecol=STR/INT
-Column name or counter (counting from one) that contains the table to be used
for the estimating the fitted function over many points through
@option{--fitestimate}.
-See @ref{Selecting table columns}.
-@end table
+@node Quantifying signal in a tile, , Sky value misconceptions, Sky value
+@subsubsection Quantifying signal in a tile
+In order to define detection thresholds on the image, or calibrate it for
measurements (subtract the signal of the background sky and define errors), we
need some basic measurements.
+For example, the quantile threshold in NoiseChisel (@option{--qthresh}
option), or the mean of the undetected regions (Sky) and the Sky standard
deviation (Sky STD), which are outputs of NoiseChisel and Statistics.
+But astronomical images will contain a lot of stars and galaxies that will
bias those measurements if not properly accounted for.
+Quantifying where signal is present is thus a very important step in the usage
of a dataset; for example, if the Sky level is over-estimated, your target
object's magnitude will be under-estimated.
+@cindex Data
+@cindex Noise
+@cindex Signal
+@cindex Gaussian distribution
+Let's start by clarifying some definitions:
+@emph{Signal} is defined as the non-random source of flux in each pixel (you
can think of this as the mean in a Gaussian or Poisson distribution).
+In astronomical images, signal is mostly photons coming from a star or galaxy
that are counted in each pixel.
+@emph{Noise} is defined as the random source of flux in each pixel (or the
standard deviation of a Gaussian or Poisson distribution).
+Noise is mainly due to counting errors in the detector electronics upon data
collection.
+@emph{Data} is defined as the combination of signal and noise (so a noisy
image of a galaxy is one @emph{data}set).
+When a dataset does not have any signal (for example, you take an image with a
closed shutter, producing an image that only contains noise), the mean, median
and mode of the distribution are equal within statistical errors.
+Signal from emitting objects, like astronomical targets, always has a positive
value and will never become negative, see Figure 1 in
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa (2015)}.
+Therefore, when signal is added to the data (you take an image with an open
shutter pointing to a galaxy for example), the mean, median and mode of the
dataset shift to the positive, creating a positively skewed distribution.
+The shift of the mean is the largest.
+The median shifts less, since it is defined after ordering all the
elements/pixels (the median is the value at a quantile of 0.5), so it is less
affected by outliers.
+Finally, the mode's shift to the positive is the smallest.
+@cindex Mean
+@cindex Median
+@cindex Quantile
+Inverting the argument above gives us a robust method to quantify the
significance of signal in a dataset: when the mean and median of a distribution
are approximately equal we can argue that there is no significant signal.
+In other words: when the quantile of the mean (@mymath{q_{mean}}) is around
0.5.
+This definition of skewness through the quantile of the mean is further
introduced with a real image in the tutorials, see @ref{Skewness caused by
signal and its measurement}.
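+
+As a minimal sketch (on a hypothetical @file{image.fits}, and assuming your
+Gnuastro version provides the @option{--quantofmean} option), you can
+measure the quantile of the mean directly:
+
+@example
+$ aststatistics image.fits --quantofmean
+@end example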
-@node Contour options, Statistics on tiles, Fitting options, Invoking
aststatistics
-@subsubsection Contour options
+@cindex Signal-to-noise ratio
+However, in an astronomical image, some of the pixels will contain more signal
than the rest, so we cannot simply check @mymath{q_{mean}} on the whole dataset.
+For example, if we only look at the patch of pixels that are placed under the
central parts of the brightest stars in the field of view, @mymath{q_{mean}}
will be very high.
+The signal in other parts of the image will be weaker, and in some parts it
will be much smaller than the noise (for example, 1/100-th of the noise level).
+When the signal-to-noise ratio is very small, we can generally assume no
signal (because it is effectively impossible to measure it) and
@mymath{q_{mean}} will be approximately 0.5.
-Contours are useful to highlight the 2D shape of a certain flux level over an
image.
-To derive contours in Statistics, you can use the option below:
+To address this problem, we break the image into a grid of tiles@footnote{The
options to customize the tessellation are discussed in @ref{Processing
options}.} (see @ref{Tessellation}).
+For example, a tile can be a square box of size @mymath{30\times30} pixels.
+By measuring @mymath{q_{mean}} on each tile, we can find which tiles contain
significant signal and ignore them.
+Technically, if a tile's @mymath{|q_{mean}-0.5|} is larger than the value
given to the @option{--meanmedqdiff} option, that tile will be ignored for the
next steps.
+You can read this option as ``mean-median-quantile-difference''.
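+
+For example, a minimal sketch (the value and @file{image.fits} are only
+illustrative) that is stricter than the default when estimating the Sky:
+
+@example
+$ aststatistics image.fits --sky --meanmedqdiff=0.005
+@end example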
-@table @option
+@cindex Skewness
+@cindex Convolution
+The raw dataset's pixel distribution (in each tile) is noisy, so to decrease
the noise/error in estimating @mymath{q_{mean}}, we convolve the image before
tessellation (see @ref{Convolution process}).
+Convolution decreases the range of the dataset and enhances its skewness, see
Section 3.1.1 and Figure 4 in @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa (2015)}.
+This enhanced skewness can be interpreted as an increase in the
signal-to-noise ratio of the objects buried in the noise.
+Therefore, to obtain an even better measure of the presence of signal in a
tile, the mean and median discussed above are measured on the convolved image.
-@item -R FLT[,FLT[,FLT...]]
-@itemx --contour=FLT[,FLT[,FLT...]]
-@cindex Contour
-@cindex Plot: contour
-Write the contours for the requested levels in a file ending with
@file{_contour.txt}.
-It will have three columns: the first two are the coordinates of each point
and the third is the level it belongs to (one of the input values).
-Each disconnected contour region will be separated by a blank line.
-This is the requested format for adding contours with PGFPlots in @LaTeX{}.
-If any other format can be useful for your work please let us know so we can
add it.
-If the image has World Coordinate System information, the written coordinates
will be in RA and Dec, otherwise, they will be in pixel coordinates.
+@cindex Cosmic rays
+There is one final hurdle: raw astronomical datasets are commonly peppered
with cosmic rays.
+Images of cosmic rays are not smoothed by the atmosphere or telescope
aperture, so they have sharp boundaries.
+Also, since they do not occupy too many pixels, they do not affect the mode
and median calculations.
+But their very high values can greatly bias the calculation of the mean
(recall how the mean shifts the fastest in the presence of outliers), for
example, see Figure 15 in @url{https://arxiv.org/abs/1505.01664, Akhlaghi and
Ichikawa (2015)}.
+The effect of outliers like cosmic rays on the mean and standard deviation can
be removed through @mymath{\sigma}-clipping, see @ref{Sigma clipping} for a
complete explanation.
-Note that currently, this is a very crude/simple implementation, please let us
know if you find problematic situations so we can fix it.
-@end table
+Therefore, after asserting that the mean and median are approximately equal in
a tile (see @ref{Tessellation}), the Sky and its STD are measured on each tile
after @mymath{\sigma}-clipping with the @option{--sigmaclip} option (see
@ref{Sigma clipping}).
+In the end, some of the tiles will pass the test and will be given a value.
+Others (that had signal in them) will just be assigned a NaN (not-a-number)
value.
+But we need a measurement over each tile (and thus pixel).
+We will therefore use interpolation to assign a value to the NaN tiles.
-@node Statistics on tiles, , Contour options, Invoking aststatistics
-@subsubsection Statistics on tiles
+However, prior to interpolating over the failed tiles, another point should be
considered: large and extended galaxies, or bright stars, have wings which sink
into the noise very gradually.
+In some cases, the gradient over these wings can be on scales larger than the
tiles (for example, the pixel value changes by @mymath{0.1\sigma} over 100
pixels, but the tile has a width of 30 pixels).
-All the options described until now were from the first class of operations
discussed above: those that treat the whole dataset as one.
-However, it often happens that the relative position of the dataset elements
over the dataset is significant.
-For example, you do not want one median value for the whole input image, you
want to know how the median changes over the image.
-For such operations, the input has to be tessellated (see @ref{Tessellation}).
-Thus this class of options cannot currently be called along with the options
above in one run of Statistics.
+In such cases, the @mymath{q_{mean}} test will be successful, even though
there is signal.
+Recall that @mymath{q_{mean}} is a measure of skewness.
+If we do not identify (and thus set to NaN) such outlier tiles before the
interpolation, the photons of the outskirts of the objects will leak into the
detection thresholds or Sky and Sky STD measurements and bias our result, see
@ref{Detecting large extended targets}.
+Therefore, the final step of ``quantifying signal in a tile'' is to look at
this distribution of successful tiles and remove the outliers.
+@mymath{\sigma}-clipping is a good solution for removing a few outliers, but
the problem with outliers of this kind is that there may be many such tiles
(depending on the large/bright stars/galaxies in the image).
+We therefore apply the following local outlier rejection strategy.
-@table @option
+For each tile, we find the nearest @mymath{N_{ngb}} tiles that had a usable
value (@mymath{N_{ngb}} is the value given to @option{--outliernumngb}).
+We then sort those values and find the difference between the largest and
second-to-smallest elements (the minimum is not used because the scatter can
be large).
+Let's call this the tile's @emph{slope} (measured from its neighbors).
+All the tiles that are on a region of flat noise will have similar slope
values, but if a few tiles fall on the wings of a bright star or large galaxy,
their slope will be significantly larger than the tiles with no signal.
+We just have to find the smallest tile slope value that is an outlier compared
to the rest, and reject all tiles with a slope larger than that.
-@item -t
-@itemx --ontile
-Do the respective single-valued calculation over one tile of the input
dataset, not the whole dataset.
-This option must be called with at least one of the single valued options
discussed above (for example, @option{--mean} or @option{--quantile}).
-The output will be a file in the same format as the input.
-If the @option{--oneelempertile} option is called, then one element/pixel will
be used for each tile (see @ref{Processing options}).
-Otherwise, the output will have the same size as the input, but each element
will have the value corresponding to that tile's value.
-If multiple single valued operations are called, then for each operation there
will be one extension in the output FITS file.
+@cindex Outliers
+@cindex Identifying outliers
+To identify the smallest outlier, we will use the distribution of distances
between sorted elements.
+Let's assume the total number of tiles with a good mean-median quantile
difference is @mymath{N}.
+They are first sorted, and the search for the outlier starts on element
@mymath{N/3} (integer division).
+Let's take @mymath{v_i} to be the @mymath{i}-th element of the sorted input
(with no blank values) and @mymath{m} and @mymath{\sigma} as the
@mymath{\sigma}-clipped median and standard deviation from the distances of the
previous @mymath{N/3-1} elements (not including @mymath{v_i}).
+If the value given to @option{--outliersigma} is denoted by @mymath{s}, the
@mymath{i}-th element is considered an outlier when the condition below is
true.
-@item -y
-@itemx --sky
-Estimate the Sky value on each tile as fully described in @ref{Quantifying
signal in a tile}.
-As described in that section, several options are necessary to configure the
Sky estimation which are listed below.
-The output file will have two extensions: the first is the Sky value and the
second is the Sky standard deviation on each tile.
-Similar to @option{--ontile}, if the @option{--oneelempertile} option is
called, then one element/pixel will be used for each tile (see @ref{Processing
options}).
+@dispmath{{(v_i-v_{i-1})-m\over \sigma}>s}
-@end table
+@noindent
+Since @mymath{i} begins from the @mymath{N/3}-th element in the sorted array
(a quantile of @mymath{1/3=0.33}), the outlier has to be larger than the
@mymath{0.33} quantile value of the dataset (this is usually the case;
otherwise, it is hard to define it as an ``outlier''!).
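+
+For example, a minimal sketch (the values and @file{image.fits} are only
+illustrative) that customizes this outlier rejection during Sky estimation:
+
+@example
+$ aststatistics image.fits --sky --outliernumngb=15 --outliersigma=10
+@end example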
-The parameters for estimating the sky value can be set with the following
options, except for the @option{--sclipparams} option (which is also used by
the @option{--sigmaclip}), the rest are only used for the Sky value estimation.
+@cindex Bicubic interpolation
+@cindex Interpolation, bicubic
+@cindex Nearest-neighbor interpolation
+@cindex Interpolation, nearest-neighbor
+Once the outlying tiles have been successfully identified and set to NaN, we
use nearest-neighbor interpolation to give a value to all tiles in the image.
+We do not use parametric interpolation methods (like bicubic), because they
will effectively extrapolate on the edges, creating strong artifacts.
+Nearest-neighbor interpolation is very simple: for each tile, we find the
@mymath{N_{ngb}} nearest tiles that had a good value, and the tile's value is
set to the median of those neighbors.
+You can set @mymath{N_{ngb}} through the @option{--interpnumngb} option.
+Once all the tiles are given a value, a smoothing step is implemented to
remove the sharp value contrast that can happen on the edges of tiles.
+The size of the smoothing box is set with the @option{--smoothwidth} option.
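+
+For example, a minimal sketch (the values and @file{image.fits} are only
+illustrative; the smoothing width must be odd) that customizes the
+interpolation and smoothing during Sky estimation:
+
+@example
+$ aststatistics image.fits --sky --interpnumngb=15 --smoothwidth=3
+@end example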
-@table @option
+The process above is used for any of the basic measurements (for example,
identifying the quantile-based thresholds in NoiseChisel, or the Sky value in
Statistics).
+You can use the check-image feature of NoiseChisel or Statistics to visually
inspect each step (all the options that start with @option{--check}).
+For example, as mentioned in the @ref{NoiseChisel optimization} tutorial, when
given a dataset from a new instrument (with differing noise properties), we
highly recommend using @option{--checkqthresh} in your first call and visually
inspecting how the parameters above affect the final quantile threshold (e.g.,
have the wings of bright sources leaked into the threshold?).
+The same goes for the @option{--checksky} option of Statistics or NoiseChisel.
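+
+For example, minimal sketches on a hypothetical @file{image.fits}:
+
+@example
+## Inspect the Sky estimation steps of Statistics:
+$ aststatistics image.fits --sky --checksky
+
+## Inspect the quantile-threshold steps of NoiseChisel:
+$ astnoisechisel image.fits --checkqthresh
+@end example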
-@item -k=FITS
-@itemx --kernel=FITS
-File name of kernel to help in estimating the significance of signal in a
-tile, see @ref{Quantifying signal in a tile}.
-@item --khdu=STR
-Kernel HDU to help in estimating the significance of signal in a tile, see
-@ref{Quantifying signal in a tile}.
-@item --meanmedqdiff=FLT
-The maximum acceptable distance between the quantiles of the mean and median,
see @ref{Quantifying signal in a tile}.
-The initial Sky and its standard deviation estimates are measured on tiles
where the quantiles of their mean and median are less distant than the value
given to this option.
-For example, @option{--meanmedqdiff=0.01} means that only tiles where the
mean's quantile is between 0.49 and 0.51 (recall that the median's quantile is
0.5) will be used.
-@item --sclipparams=FLT,FLT
-The @mymath{\sigma}-clipping parameters, see @ref{Sigma clipping}.
-This option takes two values which are separated by a comma (@key{,}).
-Each value can either be written as a single number or as a fraction of two
numbers (for example, @code{3,1/10}).
-The first value to this option is the multiple of @mymath{\sigma} that will be
clipped (@mymath{\alpha} in that section).
-The second value is the exit criteria.
-If it is less than 1, then it is interpreted as tolerance and if it is larger
than one it is a specific number.
-Hence, in the latter case the value must be an integer.
-@item --outliersclip=FLT,FLT
-@mymath{\sigma}-clipping parameters for the outlier rejection of the Sky
-value (similar to @option{--sclipparams}).
-Outlier rejection is useful when the dataset contains a large and diffuse
(almost flat within each tile) signal.
-The flatness of the profile will cause it to successfully pass the mean-median
quantile difference test, so we will need to use the distribution of successful
tiles for removing these false positive.
-For more, see the latter half of @ref{Quantifying signal in a tile}.
-@item --outliernumngb=INT
-Number of neighboring tiles to use for outlier rejection (mostly the wings of
bright stars or galaxies).
-If this option is given a value of zero, no outlier rejection will take place.
-For more see the latter half of @ref{Quantifying signal in a tile}.
-@item --outliersigma=FLT
-Multiple of sigma to define an outlier in the Sky value estimation.
-If this option is given a value of zero, no outlier rejection will take place.
-For more see @option{--outliersclip} and the latter half of @ref{Quantifying
signal in a tile}.
-@item --smoothwidth=INT
-Width of a flat kernel to convolve the interpolated tile values.
-Tile interpolation is done using the median of the @option{--interpnumngb}
neighbors of each tile (see @ref{Processing options}).
-If this option is given a value of zero or one, no smoothing will be done.
-Without smoothing, strong boundaries will probably be created between the
values estimated for each tile.
-It is thus good to smooth the interpolated image so strong discontinuities do
not show up in the final Sky values.
-The smoothing is done through convolution (see @ref{Convolution process}) with
a flat kernel, so the value to this option must be an odd number.
-@item --ignoreblankintiles
-Do Not set the input's blank pixels to blank in the tiled outputs (for
example, Sky and Sky standard deviation extensions of the output).
-This is only applicable when the tiled output has the same size as the input,
in other words, when @option{--oneelempertile} is not called.
+@node Invoking aststatistics, , Sky value, Statistics
+@subsection Invoking Statistics
+
+Statistics will print statistical measures of an input dataset (table column
or image).
+The executable name is @file{aststatistics} with the following general template
+
+@example
+$ aststatistics [OPTION ...] InputImage.fits
+@end example
+
+@noindent
+One line examples:
+
+@example
+## Print some general statistics of input image:
+$ aststatistics image.fits
+
+## Print some general statistics of column named MAG_F160W:
+$ aststatistics catalog.fits -h1 --column=MAG_F160W
+
+## Make the histogram of the column named MAG_F160W:
+$ aststatistics table.fits -cMAG_F160W --histogram
+
+## Find the Sky value on image with a given kernel:
+$ aststatistics image.fits --sky --kernel=kernel.fits
+
+## Print Sigma-clipped results of records with a MAG_F160W
+## column value between 26 and 27:
+$ aststatistics cat.fits -cMAG_F160W -g26 -l27 --sigmaclip=3,0.2
+
+## Find the polynomial (to third order) that best fits the X and Y
+## columns of 'table.fits'. Robust fitting will be used to reject
+## outliers. Also, estimate the fitted polynomial on the same input
+## column (with errors).
+$ aststatistics table.fits --fit=polynomial-robust --fitmaxpower=3 \
+ -cX,Y --fitestimate=self --output=estimated.fits
+
+## Print the median value of all records in column MAG_F160W that
+## have a value larger than 3 in column PHOTO_Z:
+$ aststatistics tab.txt -rPHOTO_Z -g3 -cMAG_F160W --median
+
+## Calculate the median of the third column in the input table, but only
+## for rows where the mean of the first and second columns is >5.
+$ awk '($1+$2)/2 > 5 @{print $3@}' table.txt | aststatistics --median
+@end example
-By default, blank values in the input (commonly on the edges which are outside
the survey/field area) will be set to blank in the tiled outputs also.
-But in other scenarios this default behavior is not desired; for example, if
you have masked something in the input, but want the tiled output under that
also.
+@noindent
+@cindex Standard input
+Statistics can take its input dataset either from a file (image or table) or
the Standard input (see @ref{Standard input}).
+If any output file is to be created, the value given to the @option{--output}
option is used as the base name for the generated files.
+Without @option{--output}, the input name will be used to generate an output
name, see @ref{Automatic output}.
+The options described below are particular to Statistics, but for general
operations, it shares a large collection of options with the other Gnuastro
programs, see @ref{Common options} for the full list.
+For more on reading from standard input, please see the description of
@code{--stdintimeout} option in @ref{Input output options}.
+Options can also be given in configuration files, for more, please see
@ref{Configuration files}.
-@item --checksky
-Create a multi-extension FITS file showing the steps that were used to
estimate the Sky value over the input, see @ref{Quantifying signal in a tile}.
-The file will have two extensions for each step (one for the Sky and one for
the Sky standard deviation).
+The input dataset may have blank values (see @ref{Blank pixels}); in this
case, all blank pixels are ignored during the calculation.
+Initially, the full dataset will be read, but it is possible to select a
specific range of data elements to use in the analysis of each run.
+You can either directly specify a minimum and maximum value for the range of
data elements to use (with @option{--greaterequal} or @option{--lessthan}), or
specify the range using quantiles (with @option{--qrange}).
+If a range is specified, all pixels outside of it are ignored before any
processing.
-@end table
+@cindex ASCII plot
+When no operation is requested, Statistics will print some general basic
properties of the input dataset on the command-line, like the example below
(run on one of the output images of @command{make check}@footnote{You can try
it by running the command in the @file{tests} directory; open the image with a
FITS viewer and have a look at it to get a sense of how these statistics relate
to the input image/dataset.}).
+This default behavior is designed to help give you a general feeling of how
the data are distributed and help in narrowing down your analysis.
-@node NoiseChisel, Segment, Statistics, Data analysis
-@section NoiseChisel
+@example
+$ aststatistics convolve_spatial_scaled_noised.fits \
+ --greaterequal=9500 --lessthan=11000
+Statistics (GNU Astronomy Utilities) X.X
+-------
+Input: convolve_spatial_scaled_noised.fits (hdu: 0)
+Range: from (inclusive) 9500, upto (exclusive) 11000.
+Unit: counts
+-------
+ Number of elements: 9074
+ Minimum: 9622.35
+ Maximum: 10999.7
+ Mode: 10055.45996
+ Mode quantile: 0.4001983908
+ Median: 10093.7
+ Mean: 10143.98257
+ Standard deviation: 221.80834
+-------
+Histogram:
+ | **
+ | ******
+ | *******
+ | *********
+ | *************
+ | **************
+ | ******************
+ | ********************
+ | *************************** *
+ | ***************************************** ***
+ |* **************************************************************
+ |-----------------------------------------------------------------
+@end example
-@cindex Labeling
-@cindex Detection
-@cindex Segmentation
-Once instrumental signatures are removed from the raw data (image) in the
initial reduction process (see @ref{Data manipulation}).
-You are naturally eager to start answering the scientific questions that
motivated the data collection in the first place.
-However, the raw dataset/image is just an array of values/pixels, that is all!
These raw values cannot directly be used to answer your scientific questions;
for example, ``how many galaxies are there in the image?'' and ``What is their
magnitude?''.
+Gnuastro's Statistics is a very general purpose program, so to be able to
easily understand this diversity in its operations (and how to possibly run
them together), we will divide the operations into two types: those that do
not respect the position of the elements and those that do (by tessellating the
input on a tile grid, see @ref{Tessellation}).
+The former treat the whole dataset as one and can re-arrange all the elements
(for example, sort them), while the latter do their processing on each tile
independently.
+First, we will review the operations that work on the whole dataset.
-The first high-level step your analysis will therefore be to classify, or
label, the dataset elements (pixels) into two classes:
-1) Noise, where random effects are the major contributor to the value, and
-2) Signal, where non-random factors (for example, light from a distant galaxy)
are present.
-This classification of the elements in a dataset is formally known as
@emph{detection}.
+@cindex AWK
+@cindex GNU AWK
+The group of options below can be used to get single value measurement(s) of
the whole dataset.
+They will print only the requested value as one field in a line/row, like the
@option{--mean}, @option{--median} options.
+These options can be called any number of times and in any order.
+The outputs of all such options will be printed on one line following each
other (with a space character between them).
+This feature makes these options very useful in scripts, or to redirect into
programs like GNU AWK for higher-level processing.
+These are some of the most basic measures; Gnuastro is still under heavy
development and this list will grow.
+If you want another statistical parameter, please contact us and we will do
our best to add it to this list, see @ref{Suggest new feature}.
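+
+For example, a minimal sketch (on a hypothetical @file{image.fits},
+assuming the @option{--mean} and @option{--std} single-value options) that
+prints the mean and standard deviation on one line, then feeds them to AWK
+to compute a @mymath{3\sigma} upper limit:
+
+@example
+$ aststatistics image.fits --mean --std
+$ aststatistics image.fits --mean --std | awk '@{print $1+3*$2@}'
+@end example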
-In an observational/experimental dataset, signal is always buried in noise:
only mock/simulated datasets are free of noise.
-Therefore detection, or the process of separating signal from noise,
determines the number of objects you study and the accuracy of any higher-level
measurement you do on them.
-Detection is thus the most important step of any analysis and is not trivial.
-In particular, the most scientifically interesting astronomical targets are
faint, can have a large variety of morphologies, along with a large
distribution in magnitude and size.
-Therefore when noise is significant, proper detection of your targets is a
uniquely decisive step in your final scientific analysis/result.
+@menu
+* Input to Statistics:: How to specify the inputs to Statistics.
+* Single value measurements:: Can be used together (like --mean, or
--maximum).
+* Generating histograms and cumulative frequency plots:: Histogram and CFP
tables.
+* Fitting options:: Least squares fitting.
+* Contour options:: Table of contours.
+* Statistics on tiles:: Possible to do single-valued measurements on
tiles.
+@end menu
-@cindex Erosion
-NoiseChisel is Gnuastro's program for detection of targets that do not have a
sharp border (almost all astronomical objects).
-When the targets have sharp edges/borders (for example, cells in biological
imaging), a simple threshold is enough to separate them from noise and each
other (if they are not touching).
-To detect such sharp-edged targets, you can use Gnuastro's Arithmetic program
in a command like below (assuming the threshold is @code{100}, see
@ref{Arithmetic}):
+@node Input to Statistics, Single value measurements, Invoking aststatistics,
Invoking aststatistics
+@subsubsection Input to Statistics
-@example
-$ astarithmetic in.fits 100 gt 2 connected-components
-@end example
+The following set of options are for specifying the input/outputs of
Statistics.
+There are many other input/output options that are common to all Gnuastro
programs including Statistics, see @ref{Input output options} for those.
-Since almost no astronomical target has such sharp edges, we need a more
advanced detection methodology.
-NoiseChisel uses a new noise-based paradigm for detection of very extended and
diffuse targets that are drowned deeply in the ocean of noise.
-It was initially introduced in @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa [2015]} and improvements after the first four were published in
@url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}.
-Please take the time to go through these papers to most effectively understand
the need of NoiseChisel and how best to use it.
+@table @option
-The name of NoiseChisel is derived from the first thing it does after
thresholding the dataset: to erode it.
-In mathematical morphology, erosion on pixels can be pictured as carving-off
boundary pixels.
-Hence, what NoiseChisel does is similar to what a wood chisel or stone chisel
do.
-It is just not a hardware, but a software.
-In fact, looking at it as a chisel and your dataset as a solid cube of rock
will greatly help in effectively understanding and optimally using it: with
NoiseChisel you literally carve your targets out of the noise.
-Try running it with the @option{--checkdetection} option, and open the
temporary output as a multi-extension cube, to see each step of the carving
process on your input dataset (see @ref{Viewing FITS file contents with DS9 or
TOPCAT}).
+@item -c STR/INT
+@itemx --column=STR/INT
+The column to use when the input file is a table with more than one column.
+See @ref{Selecting table columns} for a full description of how to use this
option.
+For more on how tables are read in Gnuastro, please see @ref{Tables}.
-@cindex Segmentation
-NoiseChisel's primary output is a binary detection map with the same size as
the input but its pixels only have two values: 0 (background) and 1
(foreground).
-Pixels that do not harbor any detected signal (noise) are given a label (or
value) of zero and those with a value of 1 have been identified as hosting
signal.
+@item -g FLT
+@itemx --greaterequal=FLT
+Limit the range of inputs to those with values greater than or equal to what
is given to this option.
+None of the values below this value will be used in any of the processing
steps below.
-Segmentation is the process of classifying the signal into higher-level
constructs.
-For example, if you have two separate galaxies in one image, NoiseChisel will
give a value of 1 to the pixels of both (each forming an ``island'' of touching
foreground pixels).
-After segmentation, the connected foreground pixels will get separate labels,
enabling you to study them individually.
-NoiseChisel is only focused on detection (separating signal from noise), to
@emph{segment} the signal (into separate galaxies for example), Gnuastro has a
separate specialized program @ref{Segment}.
-NoiseChisel's output can be directly/readily fed into Segment.
+@item -l FLT
+@itemx --lessthan=FLT
+Limit the range of inputs to those with values less than what is given to
this option.
+None of the values greater or equal to this value will be used in any of the
processing steps below.
-For more on NoiseChisel's output format and its benefits (especially in
conjunction with @ref{Segment} and later @ref{MakeCatalog}), please see
@url{https://arxiv.org/abs/1611.06387, Akhlaghi [2016]}.
-Just note that when that paper was published, Segment was not yet spun-off
into a separate program, and NoiseChisel done both detection and segmentation.
+@item -Q FLT[,FLT]
+@itemx --qrange=FLT[,FLT]
+Specify the range of usable inputs using the quantile.
+This option can take one or two quantiles to specify the range.
+When only one number is input (let's call it @mymath{Q}), the range will be
those values in the quantile range @mymath{Q} to @mymath{1-Q}.
+So when only one value is given, it must be less than 0.5.
+When two values are given, the first is used as the lower quantile and the
+second as the upper quantile of the range.
-NoiseChisel's output is designed to be generic enough to be easily used in any
higher-level analysis.
-If your targets are not touching after running NoiseChisel and you are not
interested in their sub-structure, you do not need the Segment program at all.
-You can ask NoiseChisel to find the connected pixels in the output with the
@option{--label} option.
-In this case, the output will not be a binary image any more, the signal will
have counters/labels starting from 1 for each connected group of pixels.
-You can then directly feed NoiseChisel's output into MakeCatalog for
measurements over the detections and the production of a catalog (see
@ref{MakeCatalog}).
+@cindex Quantile
+The quantile of a given element in a dataset is defined by the ratio of its
+index to the total number of values in the sorted input array.
+So the smallest and largest values in the dataset have a quantile of 0.0 and
1.0.
+The quantile is a very useful non-parametric (making no assumptions about the
input) relative measure to specify a range.
+It can best be understood in terms of the cumulative frequency plot, see
@ref{Histogram and Cumulative Frequency Plot}.
+The quantile of each horizontal axis value in the cumulative frequency plot
+is the vertical axis value associated with it.
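+
+For example, the commands below (on a hypothetical @file{image.fits}) are a
+minimal sketch of the two ways of calling this option:
+
+@example
+$ aststatistics image.fits --qrange=0.05        ## 0.05 to 0.95.
+$ aststatistics image.fits --qrange=0.05,0.90   ## 0.05 to 0.90.
+@end example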
-Thanks to the published papers mentioned above, there is no need to provide a
more complete introduction to NoiseChisel in this book.
-However, published papers cannot be updated any more, but the software has
evolved/changed.
-The changes since publication are documented in @ref{NoiseChisel changes after
publication}.
-In @ref{Invoking astnoisechisel}, the details of running NoiseChisel and its
options are discussed.
+@end table
-As discussed above, detection is one of the most important steps for your
scientific result.
-It is therefore very important to obtain a good understanding of NoiseChisel
(and afterwards @ref{Segment} and @ref{MakeCatalog}).
-We strongly recommend reviewing two tutorials of @ref{General program usage
tutorial} and @ref{Detecting large extended targets}.
-They are designed to show how to most effectively use NoiseChisel for the
detection of small faint objects and large extended objects.
-In the meantime, they also show the modular principle behind Gnuastro's
programs and how they are built to complement, and build upon, each other.
+@node Single value measurements, Generating histograms and cumulative
frequency plots, Input to Statistics, Invoking aststatistics
+@subsubsection Single value measurements
-@ref{General program usage tutorial} culminates in using NoiseChisel to detect
galaxies and use its outputs to find the galaxy colors.
-Defining colors is a very common process in most science-cases.
-Therefore it is also recommended to (patiently) complete that tutorial for
optimal usage of NoiseChisel in conjunction with all the other Gnuastro
programs.
-@ref{Detecting large extended targets} shows how you can optimize
-NoiseChisel's settings for very extended objects, to successfully carve out
-signal down to signal-to-noise ratio levels below 1/10.
-After going through those tutorials, play a little with the settings (in the
order presented in the paper and @ref{Invoking astnoisechisel}) on a dataset
you are familiar with and inspect all the check images (options starting with
@option{--check}) to see the effect of each parameter.
+@table @option
-Below, in @ref{Invoking astnoisechisel}, we will review NoiseChisel's input,
detection, and output options in @ref{NoiseChisel input}, @ref{Detection
options}, and @ref{NoiseChisel output}.
-If you have used NoiseChisel within your research, please run it with
@option{--cite} to list the papers you should cite and how to acknowledge its
funding sources.
+@item -n
+@itemx --number
+Print the number of all used (non-blank and in range) elements.
-@menu
-* NoiseChisel changes after publication:: Updates since published papers.
-* Invoking astnoisechisel:: Options and arguments for NoiseChisel.
-@end menu
+@item --minimum
+Print the minimum value of all used elements.
-@node NoiseChisel changes after publication, Invoking astnoisechisel,
NoiseChisel, NoiseChisel
-@subsection NoiseChisel changes after publication
+@item --maximum
+Print the maximum value of all used elements.
-NoiseChisel was initially introduced in @url{https://arxiv.org/abs/1505.01664,
Akhlaghi and Ichikawa [2015]} and updates after the first four years were
published in @url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}.
-To help in understanding how it works, those papers have many figures showing
every step on multiple mock and real examples.
-We recommend reading these papers for a good understanding of what it does
-and how each parameter influences the output.
+@item --sum
+Print the sum of all used elements.
-However, the papers cannot be updated anymore, but NoiseChisel has evolved
(and will continue to do so): better algorithms or steps have been found and
implemented and some options have been added, removed or changed behavior.
-This book is thus the final and definitive guide to NoiseChisel.
-The aim of this section is to make the transition from the papers above to the
installed version on your system, as smooth as possible with the list below.
-For a more detailed list of changes in each Gnuastro version, please see the
@file{NEWS} file@footnote{The @file{NEWS} file is present in the released
Gnuastro tarball, see @ref{Release tarball}.}.
+@item -m
+@itemx --mean
+Print the mean (average) of all used elements.
-@itemize
-@item
-An improved outlier rejection for identifying tiles without any signal has
been implemented in the quantile-threshold phase:
-Prior to version 0.14, outliers were defined globally: the distribution of all
tiles with an acceptable @option{--meanmedqdiff} was inspected and outliers
were found and rejected.
-However, this caused problems when there are strong gradients over the image
(for example, an image prior to flat-fielding, or in the presence of a large
foreground galaxy).
-In these cases, the faint wings of galaxies/stars could be mistakenly
identified as Sky (leaving a footprint of the object on the Sky output) and
wrongly subtracted.
+@item -t
+@itemx --std
+Print the standard deviation of all used elements.
+
+@item -E
+@itemx --median
+Print the median of all used elements.
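+
+For example, a minimal sketch requesting it along with the measurements
+above in one call (on a hypothetical @file{image.fits}):
+
+@example
+$ aststatistics image.fits --number --mean --std --median
+@end example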
-It was possible to play with the parameters to correct this for that
particular dataset, but that was frustrating.
-Therefore from version 0.14, instead of finding outliers from the full tile
distribution, we now measure the @emph{slope} of the tile's nearby tiles and
find outliers locally.
-Three options have been added to configure this part of NoiseChisel:
@option{--outliernumngb}, @option{--outliersclip} and @option{--outliersigma}.
-For more on the local outlier-by-distance algorithm and the definition of
@emph{slope} mentioned above, see @ref{Quantifying signal in a tile}.
-In our tests, this gave a much improved estimate of the quantile thresholds
and final Sky values with default values.
-@end itemize
+@item -u FLT[,FLT[,...]]
+@itemx --quantile=FLT[,FLT[,...]]
+Print the values at the given quantiles of the input dataset.
+Any number of quantiles may be given and one number will be printed for each.
+Values can be written either as single numbers or as fractions, but must be
+between zero and one (inclusive).
+Hence, in effect @command{--quantile=0.25 --quantile=0.75} is equivalent to
@option{--quantile=0.25,3/4}, or @option{-u1/4,3/4}.
+The returned value is one of the elements from the dataset.
+Taking @mymath{q} to be your desired quantile, and @mymath{N} to be the total
number of used (non-blank and within the given range) elements, the returned
value is at the following position in the sorted array:
@mymath{round(q\times{}N)}.
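+
+For example, a minimal sketch on a hypothetical @file{image.fits}:
+
+@example
+$ aststatistics image.fits --quantile=0.05,0.95
+@end example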
+@item --quantfunc=FLT[,FLT[,...]]
+Print the quantiles of the given values in the dataset.
+This option is the inverse of @option{--quantile} and operates similarly,
+except that the acceptable values must be within the range of the dataset,
+not between 0 and 1.
+Formally it is known as the ``Quantile function''.
+Since the dataset is not continuous, this function will find the nearest
+element of the dataset and use its position to estimate the quantile
+function.
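+
+For example, to find the quantiles of the (hypothetical) pixel values 100
+and 200 in @file{image.fits}:
+
+@example
+$ aststatistics image.fits --quantfunc=100,200
+@end example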
-@node Invoking astnoisechisel, , NoiseChisel changes after publication,
NoiseChisel
-@subsection Invoking NoiseChisel
+@item --quantofmean
+@cindex Quantile of the mean
+Print the quantile of the mean in the dataset.
+This is a very good measure for detecting skewness or outliers.
+The concept is used by programs like NoiseChisel to identify the presence of
signal in a tile of the image (because signal in noise causes skewness).
-NoiseChisel will detect signal in noise producing a multi-extension dataset
containing a binary detection map which is the same size as the input.
-Its output can be readily used for input into @ref{Segment}, for higher-level
segmentation, or @ref{MakeCatalog} to do measurements and generate a catalog.
-The executable name is @file{astnoisechisel} with the following general
template
+For example, take this simple array: @code{1 2 20 4 5 6 3}.
+The mean is @code{5.85}.
+The nearest element to this mean is @code{6} and the quantile of @code{6} in
this distribution is 0.8333.
+Here is how we got to this: in the sorted dataset (@code{1 2 3 4 5 6 20}),
@code{6} is the 5-th element (counting from zero, since a quantile of zero
corresponds to the minimum, by definition) and the maximum is the 6-th element
(again, counting from zero).
+So the quantile of the mean in this case is @mymath{5/6=0.8333}.
-@example
-$ astnoisechisel [OPTION ...] InputImage.fits
-@end example
+In the example above, if we had @code{7} instead of @code{20} (which was an
outlier), then the mean would be @code{4} and the quantile of the mean would be
0.5 (which by definition, is the quantile of the median), showing no outliers.
+As the number of elements increases, the mean itself is less affected by a
small number of outliers, but skewness can be nicely identified by the quantile
of the mean.
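+
+For example, you can reproduce the toy example above by writing the values
+into a plain-text table (see @ref{Gnuastro text table format}) and calling
+Statistics on it:
+
+@example
+$ printf "1\n2\n20\n4\n5\n6\n3\n" > sample.txt
+$ aststatistics sample.txt --quantofmean
+@end example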
-@noindent
-One line examples:
+@item -O
+@itemx --mode
+Print the mode of all used elements.
+The mode is found through the mirror distribution which is fully described in
Appendix C of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
2015}.
+See that section for a full description.
-@example
-## Detect signal in input.fits.
-$ astnoisechisel input.fits
+This mode calculation algorithm is non-parametric, so it can fail when the
+dataset is not large enough (roughly fewer than 1000 elements) or does not
+have a clear mode.
+In such cases, this option will return @code{nan} (the floating point NaN
+value).
-## Inspect all the detection steps after changing a parameter.
-$ astnoisechisel input.fits --qthresh=0.4 --checkdetection
+As described in that paper, the easiest way to assess the quality of this
+mode calculation method is to use its symmetricity (see @option{--modesym}
+below).
+A better way is to use the @option{--mirror} option to generate a table with
+the histogram and cumulative frequency plot for any given mirror value (the
+mode in this case).
+If you generate plots like those shown in Figure 21 of that paper, then your
mode is accurate.
-## Detect signal assuming input has 4 amplifier channels along first
-## dimension and 1 along the second. Also set the regular tile size
-## to 100 along both dimensions:
-$ astnoisechisel --numchannels=4,1 --tilesize=100,100 input.fits
-@end example
+@item --modequant
+Print the quantile of the mode.
+You can get the actual mode value from the @option{--mode} described above.
+In many cases, the absolute value of the mode is irrelevant, but its position
within the distribution is important.
+In such cases, this option comes in handy.
-@cindex Gaussian
-@noindent
-If NoiseChisel is to do processing (for example, you do not want to get help,
or see the values to each input parameter), an input image should be provided
with the recognized extensions (see @ref{Arguments}).
-NoiseChisel shares a large set of common operations with other Gnuastro
programs, mainly regarding input/output, general processing steps, and general
operating modes.
-To help in a unified experience between all of Gnuastro's programs, these
operations have the same command-line options, see @ref{Common options} for a
full list/description (they are not repeated here).
+@item --modesym
+Print the symmetricity of the calculated mode.
+See the description of @option{--mode} for more.
+This mode algorithm finds the mode based on how symmetric it is, so if the
+symmetricity returned by this option is too low, the mode is not very
+accurate.
+See Appendix C of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
2015} for a full description.
+In practice, symmetricity values larger than 0.2 are mostly good.
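+
+For example, a minimal sketch to inspect the mode together with its quantile
+and symmetricity (on a hypothetical @file{image.fits}):
+
+@example
+$ aststatistics image.fits --mode --modequant --modesym
+@end example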
-As in all Gnuastro programs, options can also be given to NoiseChisel in
configuration files.
-For a thorough description on Gnuastro's configuration file parsing, please
see @ref{Configuration files}.
-All of NoiseChisel's options with a short description are also always
available on the command-line with the @option{--help} option, see @ref{Getting
help}.
-To inspect the option values without actually running NoiseChisel, append your
command with @option{--printparams} (or @option{-P}).
+@item --modesymvalue
+Print the value in the distribution where the mirror and input
+distributions are no longer symmetric, see @option{--mode} and Appendix C
+of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa 2015} for
+more.
-NoiseChisel's input image may contain blank elements (see @ref{Blank pixels}).
-Blank elements will be ignored in all steps of NoiseChisel.
-Hence if your dataset has bad pixels which should be masked with a mask image,
please use Gnuastro's @ref{Arithmetic} program (in particular its
@command{where} operator) to convert those pixels to blank pixels before
running NoiseChisel.
-Gnuastro's Arithmetic program has bitwise operators helping you select
specific kinds of bad-pixels when necessary.
+@item --sigclip-number
+Number of elements after applying @mymath{\sigma}-clipping (see @ref{Sigma
clipping}).
+@mymath{\sigma}-clipping configuration is done with the
+@option{--sclipparams} option.
-A convolution kernel can also be optionally given.
-If a value (file name) is given to @option{--kernel} on the command-line or in
a configuration file (see @ref{Configuration files}), then that file will be
used to convolve the image prior to thresholding.
-Otherwise a default kernel will be used.
-For a 2D image, the default kernel is a 2D Gaussian with a FWHM of 2 pixels
truncated at 5 times the FWHM.
-This choice of the default kernel is discussed in Section 3.1.1 of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}.
-For a 3D cube, it is a Gaussian with FWHM of 1.5 pixels in the first two
dimensions and 0.75 pixels in the third dimension.
-See @ref{Convolution kernel} for kernel related options.
-Passing @code{none} to @option{--kernel} will disable convolution.
-On the other hand, through the @option{--convolved} option, you may provide an
already convolved image, see descriptions below for more.
+@item --sigclip-median
+Median after applying @mymath{\sigma}-clipping (see @ref{Sigma clipping}).
+@mymath{\sigma}-clipping configuration is done with the
+@option{--sclipparams} option.
-NoiseChisel defines two tessellations over the input (see @ref{Tessellation}).
-This enables it to deal with possible gradients in the input dataset and also
significantly improve speed by processing each tile on different threads
simultaneously.
-Tessellation related options are discussed in @ref{Processing options}.
-In particular, NoiseChisel uses two tessellations (with everything between
them identical except the tile sizes): a fine-grained one with smaller tiles
(used in thresholding and Sky value estimations) and another with larger tiles
which is used for pseudo-detections over non-detected regions of the image.
-The common Tessellation options described in @ref{Processing options} define
all parameters of both tessellations.
-The large tile size for the latter tessellation is set through the
@option{--largetilesize} option.
-To inspect the tessellations on your input dataset, run NoiseChisel with
@option{--checktiles}.
+@cindex Outlier
+Here is one scenario where this can be useful: assume you have a table and you
would like to remove the rows that are outliers (not within the
@mymath{\sigma}-clipping range).
+Let's assume your table is called @file{table.fits} and you only want to keep
the rows that have a value in @code{COLUMN} within the @mymath{\sigma}-clipped
range (to @mymath{3\sigma}, with a tolerance of 0.1).
+This command will return the @mymath{\sigma}-clipped median and standard
deviation (used to define the range later).
-@cartouche
-@noindent
-@strong{Usage TIP:} Frequently use the options starting with @option{--check}.
-Since the noise properties differ between different datasets, you can often
play with the parameters/options for a better result than the default
parameters.
-You can start with @option{--checkdetection} for the main steps.
-For the full list of NoiseChisel's checking options please run:
@example
-$ astnoisechisel --help | grep check
+$ aststatistics table.fits -cCOLUMN --sclipparams=3,0.1 \
+ --sigclip-median --sigclip-std
@end example
-@end cartouche
-@cartouche
-@noindent
-@strong{Not detecting wings of bright galaxies:} In such cases, probably the
best solution is to increase @option{--outliernumngb} (to reject tiles that are
affected by very flat diffuse signal).
-For more, see @ref{Quantifying signal in a tile}.
-@end cartouche
+@cindex GNU AWK
+You can then use the @option{--range} option of Table (see @ref{Table}) to
select the proper rows.
+But for that, you need the actual starting and ending values of the range
(@mymath{m\pm s\sigma}; where @mymath{m} is the median and @mymath{s} is the
multiple of sigma to define an outlier).
+Therefore, the raw outputs of Statistics in the command above are not enough.
-When working on 3D datacubes, the tessellation options need three values and
updating them every time can be annoying/buggy.
-To simplify the job, NoiseChisel also installs a @file{astnoisechisel-3d.conf}
configuration file (see @ref{Configuration files}).
-You can use this for default values on datacubes.
-For example, if you installed Gnuastro with the prefix @file{/usr/local} (the
default location, see @ref{Installation directory}), you can benefit from this
configuration file by running NoiseChisel like the example below.
+To get the starting and ending values of the non-outlier range (and put a
`@key{,}' between them, ready to be used in @option{--range}), pipe the result
into AWK.
+But in AWK, we will also need the multiple of @mymath{\sigma}, so we will
define it as a shell variable (@code{s}) before calling Statistics (note how
@code{$s} is used two times now):
@example
-$ astnoisechisel cube.fits \
- --config=/usr/local/etc/astnoisechisel-3d.conf
+$ s=3
+$ aststatistics table.fits -cCOLUMN --sclipparams=$s,0.1 \
+ --sigclip-median --sigclip-std \
+ | awk '@{s='$s'; printf("%f,%f\n", $1-s*$2, $1+s*$2)@}'
@end example
-@cindex Shell alias
-@cindex Alias (shell)
-@cindex Shell startup
-@cindex Startup, shell
-To further simplify the process, you can define a shell alias in any startup
file (for example, @file{~/.bashrc}, see @ref{Installation directory}).
-Assuming that you installed Gnuastro in @file{/usr/local}, you can add this
line to the startup file (you may put it all in one line, it is broken into two
lines here for fitting within page limits).
+To pass it on to Table, we will need to keep the printed output of the
+command above in a shell variable (@code{r}) instead of printing it.
+In Bash, you can do this by putting the whole statement within @code{$()}:
@example
-alias astnoisechisel-3d="astnoisechisel \
- --config=/usr/local/etc/astnoisechisel-3d.conf"
+$ s=3
+$ r=$(aststatistics table.fits -cCOLUMN --sclipparams=$s,0.1 \
+ --sigclip-median --sigclip-std \
+ | awk '@{s='$s'; printf("%f,%f\n", $1-s*$2, $1+s*$2)@}')
+$ echo $r # Just to confirm.
@end example
-@noindent
-Using this alias, you can call NoiseChisel with the name
@command{astnoisechisel-3d} (instead of @command{astnoisechisel}).
-It will automatically load the 3D specific configuration file first, and then
parse any other arguments, options or configuration files.
-You can change the default values in this 3D configuration file by calling
them on the command-line as you do with
@command{astnoisechisel}@footnote{Recall that for single-invocation options,
the last command-line invocation takes precedence over all previous invocations
(including those in the 3D configuration file).
-See the description of @option{--config} in @ref{Operating mode options}.}.
-For example:
+Now you can use Table with the @option{--range} option to only print the rows
that have a value in @code{COLUMN} within the desired range:
@example
-$ astnoisechisel-3d --numchannels=3,3,1 cube.fits
+$ asttable table.fits --range=COLUMN,$r
@end example
-In the sections below, NoiseChisel's options are classified into three general
classes to help in easy navigation.
-@ref{NoiseChisel input} mainly discusses the options relating to input and
those that are shared in both detection and segmentation.
-Options to configure the detection are described in @ref{Detection options}
and @ref{Segmentation options} we discuss how you can fine-tune the
segmentation of the detections.
-Finally, in @ref{NoiseChisel output} the format of NoiseChisel's output is
discussed.
-The order of options here follow the same logical order that the respective
action takes place within NoiseChisel (note that the output of @option{--help}
is sorted alphabetically).
-
-Below, we will discuss NoiseChisel's options, classified into two general
classes, to help in easy navigation.
-@ref{NoiseChisel input} mainly discusses the basic options relating to the
-inputs, prior to the detection process.
-Afterwards, @ref{Detection options} fully describes every configuration
parameter (option) related to detection and how they affect the final result.
-The order of options in this section follow the logical order within
NoiseChisel.
-On first reading (while you are still new to NoiseChisel), it is therefore
strongly recommended to read the options in the given order below.
-The output of @option{--printparams} (or @option{-P}) also has this order.
-However, the output of @option{--help} is sorted alphabetically.
-Finally, in @ref{NoiseChisel output} the format of NoiseChisel's output is
discussed.
+To save the resulting table (cleaned of outliers) in another file (for
+example, @file{cleaned.fits}; it can also have a @file{.txt} suffix), just
+add @option{--output=cleaned.fits} to the command above.
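+
+Putting it all together, the full procedure above can be run as the minimal
+sketch below (assuming the same hypothetical @file{table.fits} and
+@code{COLUMN}):
+
+@example
+$ s=3
+$ r=$(aststatistics table.fits -cCOLUMN --sclipparams=$s,0.1 \
+                    --sigclip-median --sigclip-std \
+         | awk '@{s='$s'; printf("%f,%f\n", $1-s*$2, $1+s*$2)@}')
+$ asttable table.fits --range=COLUMN,$r --output=cleaned.fits
+@end example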
-@menu
-* NoiseChisel input:: NoiseChisel's input options.
-* Detection options:: Configure detection in NoiseChisel.
-* NoiseChisel output:: NoiseChisel's output options and format.
-@end menu
+@item --sigclip-mean
+Mean after applying @mymath{\sigma}-clipping (see @ref{Sigma clipping}).
+@mymath{\sigma}-clipping configuration is done with the
+@option{--sclipparams} option.
-@node NoiseChisel input, Detection options, Invoking astnoisechisel, Invoking
astnoisechisel
-@subsubsection NoiseChisel input
+@item --sigclip-std
+Standard deviation after applying @mymath{\sigma}-clipping (see @ref{Sigma
clipping}).
+@mymath{\sigma}-clipping configuration is done with the
+@option{--sclipparams} option.
-The options here can be used to configure the inputs and output of
NoiseChisel, along with some general processing options.
-Recall that you can always see the full list of Gnuastro's options with the
@option{--help} (see @ref{Getting help}), or @option{--printparams} (or
@option{-P}) to see their values (see @ref{Operating mode options}).
+@end table
-@table @option
+@node Generating histograms and cumulative frequency plots, Fitting options,
Single value measurements, Invoking aststatistics
+@subsubsection Generating histograms and cumulative freq.
-@item -k FITS
-@itemx --kernel=FITS
-File name of kernel to smooth the image before applying the threshold, see
@ref{Convolution kernel}.
-If no convolution is needed, give this option a value of @option{none}.
+The options below are for the statistical operations that output more than
+one value.
+So while they can be called together in one run, their outputs will be
+distinct (each one's output will usually span more than one line).
-The first step of NoiseChisel is to convolve/smooth the image and use the
convolved image in multiple steps including the finding and applying of the
quantile threshold (see @option{--qthresh}).
-The @option{--kernel} option is not mandatory.
-If not called, for a 2D, image a 2D Gaussian profile with a FWHM of 2 pixels
truncated at 5 times the FWHM is used.
-This choice of the default kernel is discussed in Section 3.1.1 of Akhlaghi
and Ichikawa [2015].
+@table @option
-For a 3D cube, when no file name is given to @option{--kernel}, a Gaussian
with FWHM of 1.5 pixels in the first two dimensions and 0.75 pixels in the
third dimension will be used.
-The reason for this particular configuration is that commonly in astronomical
applications, 3D datasets do not have the same nature in all three dimensions,
commonly the first two dimensions are spatial (RA and Dec) while the third is
spectral (for example, wavelength).
-The samplings are also different, in the default case, the spatial sampling is
assumed to be larger than the spectral sampling, hence a wider FWHM in the
spatial directions, see @ref{Sampling theorem}.
+@item -A
+@itemx --asciihist
+Print an ASCII histogram of the usable values within the input dataset along
with some basic information like the example below (from the UVUDF
catalog@footnote{@url{https://asd.gsfc.nasa.gov/UVUDF/uvudf_rafelski_2015.fits.gz}}).
+The width and height of the histogram (in units of character widths and
+heights on your command-line terminal) can be set with the
+@option{--numasciibins} (for the width) and @option{--asciiheight} (for the
+height) options.
-You can use MakeProfiles to build a kernel with any of its recognized profile
types and parameters.
-For more details, please see @ref{MakeProfiles output dataset}.
-For example, the command below will make a Moffat kernel (with
@mymath{\beta=2.8}) with FWHM of 2 pixels truncated at 10 times the FWHM.
+For a full description of the histogram, please see @ref{Histogram and
Cumulative Frequency Plot}.
+An ASCII plot is certainly very crude and cannot be used in any publication,
but it is very useful for getting a general feeling of the input dataset very
fast and easily on the command-line without having to take your hands off the
keyboard (which is a major distraction!).
+If you want to try it out, you can write it all in one line and ignore the
@key{\} and extra spaces.
@example
-$ astmkprof --oversample=1 --kernel=moffat,2,2.8,10
+$ aststatistics uvudf_rafelski_2015.fits.gz --hdu=1 \
+ --column=MAG_F160W --lessthan=40 \
+ --asciihist --numasciibins=55
+ASCII Histogram:
+Number: 8593
+Y: (linear: 0 to 660)
+X: (linear: 17.7735 -- 31.4679, in 55 bins)
+ | ****
+ | *****
+ | ******
+ | ********
+ | *********
+ | ***********
+ | **************
+ | *****************
+ | ***********************
+ | ********************************
+ |*** ***************************************************
+ |-------------------------------------------------------
@end example
-Since convolution can be the slowest step of NoiseChisel, for large datasets,
you can convolve the image once with Gnuastro's Convolve (see @ref{Convolve}),
and use the @option{--convolved} option to feed it directly to NoiseChisel.
-This can help getting faster results when you are playing/testing the
higher-level options.
-
-@item --khdu=STR
-HDU containing the kernel in the file given to the @option{--kernel}
-option.
-
-@item --convolved=FITS
-Use this file as the convolved image and do not do convolution (ignore
@option{--kernel}).
-NoiseChisel will just check the size of the given dataset is the same as the
input's size.
-If a wrong image (with the same size) is given to this option, the results
(errors, bugs, etc.) are unpredictable.
-So please use this option with care and in a highly controlled environment,
for example, in the scenario discussed below.
+@item --asciicfp
+Print the cumulative frequency plot of the usable elements in the input
dataset.
+Please see the description of @option{--asciihist} for more; the example
+below uses the same input table.
+To better understand the cumulative frequency plot, please see @ref{Histogram
and Cumulative Frequency Plot}.
-In almost all situations, as the input gets larger, the single most CPU (and
time) consuming step in NoiseChisel (and other programs that need a convolved
image) is convolution.
-Therefore minimizing the number of convolutions can save a significant amount
of time in some scenarios.
-One such scenario is when you want to segment NoiseChisel's detections using
the same kernel (with @ref{Segment}, which also supports this
@option{--convolved} option).
-This scenario would require two convolutions of the same dataset: once by
NoiseChisel and once by Segment.
-Using this option in both programs, only one convolution (prior to running
NoiseChisel) is enough.
+@example
+$ aststatistics uvudf_rafelski_2015.fits.gz --hdu=1 \
+ --column=MAG_F160W --lessthan=40 \
+ --asciicfp --numasciibins=55
+ASCII Cumulative frequency plot:
+Y: (linear: 0 to 8593)
+X: (linear: 17.7735 -- 31.4679, in 55 bins)
+ | *******
+ | **********
+ | ***********
+ | *************
+ | **************
+ | ***************
+ | *****************
+ | *******************
+ | ***********************
+ | ******************************
+ |*******************************************************
+ |-------------------------------------------------------
+@end example
-Another common scenario where this option can be convenient is when you are
testing NoiseChisel (or Segment) for the best parameters.
-You have to run NoiseChisel multiple times and see the effect of each change.
-However, once you are happy with the kernel, re-convolving the input on every
change of higher-level parameters will greatly hinder, or discourage, further
testing.
-With this option, you can convolve the input image with your chosen kernel
once before running NoiseChisel, then feed it to NoiseChisel on each test run
and thus save valuable time for better/more tests.
+@item -H
+@itemx --histogram
+Save the histogram of the usable values in the input dataset into a table.
+The first column is the value at the center of the bin and the second is the
number of points in that bin.
+If the @option{--cumulative} option is also called with this option in a run,
then the table will have three columns (the third is the cumulative frequency
plot).
+Through the @option{--numbins}, @option{--onebinstart}, or
+@option{--manualbinrange} options, you can modify the values of the first
+column, and with @option{--normalize} and @option{--maxbinone} you can
+modify the values of the second column.
+See below for the description of each.
-To build your desired convolution kernel, you can use @ref{MakeProfiles}.
-To convolve the image with a given kernel you can use @ref{Convolve}.
-Spatial domain convolution is mandatory: in the frequency domain, blank pixels
(if present) will cover the whole image and gradients will appear on the edges,
see @ref{Spatial vs. Frequency domain}.
+By default (when no @option{--output} is specified) a plain text table will be
created, see @ref{Gnuastro text table format}.
+If a FITS name is specified, you can use the common option
@option{--tableformat} to have it as a FITS ASCII or FITS binary format, see
@ref{Common options}.
+This table can then be fed into your favorite plotting tool to get a much
+cleaner and nicer histogram than what the raw command-line can offer you
+(with the @option{--asciihist} option).
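+
+For example, a minimal sketch that saves the histogram of a hypothetical
+@code{COLUMN} into a plain-text table:
+
+@example
+$ aststatistics table.fits -cCOLUMN --histogram \
+                --numbins=100 --output=histogram.txt
+@end example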
-Below you can see an example of the second scenario: you want to see how
variation of the growth level (through the @option{--detgrowquant} option) will
affect the final result.
-Recall that you can ignore all the extra spaces, new lines, and backslash's
(`@code{\}') if you are typing in the terminal.
-In a shell script, remove the @code{$} signs at the start of the lines.
+@item --histogram2d
+Save the 2D histogram of two input columns into an output file, see @ref{2D
Histograms}.
+The output will have three columns: the first two are the coordinates of
+each box's center in the first and second dimensions/columns.
+The third will be the number of input points that fall within that box.
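+
+For example, a minimal sketch on two hypothetical columns (the binning of
+each dimension is set separately):
+
+@example
+$ aststatistics table.fits -cAXIS1,AXIS2 --histogram2d \
+                --numbins=50 --numbins2=50 --output=hist2d.txt
+@end example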
-@example
-## Make the kernel to convolve with.
-$ astmkprof --oversample=1 --kernel=gaussian,2,5
+@item -C
+@itemx --cumulative
+Save the cumulative frequency plot of the usable values in the input dataset
into a table, similar to @option{--histogram}.
-## Convolve the input with the given kernel.
-$ astconvolve input.fits --kernel=kernel.fits \
- --domain=spatial --output=convolved.fits
+@item -s
+@itemx --sigmaclip
+Do @mymath{\sigma}-clipping on the usable pixels of the input dataset.
+See @ref{Sigma clipping} for a full description on @mymath{\sigma}-clipping
and also to better understand this option.
+The @mymath{\sigma}-clipping parameters can be set through the
@option{--sclipparams} option (see below).
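+
+For example, a minimal sketch with a @mymath{3\sigma} clip and a tolerance
+of 0.1:
+
+@example
+$ aststatistics image.fits --sigmaclip --sclipparams=3,0.1
+@end example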
-## Run NoiseChisel with seven growth quantile values.
-$ for g in 60 65 70 75 80 85 90; do \
- astnoisechisel input.fits --convolved=convolved.fits \
- --detgrowquant=0.$g --output=$g.fits; \
- done
-@end example
+@item --mirror=FLT
+Make a histogram and cumulative frequency plot of the mirror distribution of
+the given dataset, with the mirror located at the value given to this option.
+The mirror distribution is fully described in Appendix C of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa 2015} and
currently it is only used to calculate the mode (see @option{--mode}).
+Just note that the mirror distribution is a discrete distribution like the
input, so while you may give any number as the value to this option, the actual
mirror value is the closest number in the input dataset to this value.
+If the two numbers are different, Statistics will warn you of the actual
mirror value used.
+This option will make a table as output.
+Depending on your selected name for the output, it will be either a FITS table
or a plain text table (which is the default).
+It contains three columns: the first is the center of the bins, the second is
the histogram (with the largest value set to 1) and the third is the normalized
cumulative frequency plot of the mirror distribution.
+The bins will be positioned such that the mode is on the starting interval of
one of the bins to make it symmetric around the mirror.
+With this output file and the input's histogram (which you can generate in
+another run of Statistics, using @option{--onebinstart}), it is possible to
+make plots like Figure 21 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi
+and Ichikawa 2015}.
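+
+For example, a minimal sketch (the mirror value here is just an assumption;
+in practice, use your calculated mode):
+
+@example
+$ aststatistics image.fits --mirror=0.01 --output=mirror.txt
+@end example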
-@item --chdu=STR
-The HDU/extension containing the convolved image in the file given to
@option{--convolved}.
+@end table
-@item -w FITS
-@itemx --widekernel=FITS
-File name of a wider kernel to use in estimating the difference of the mode
and median in a tile (this difference is used to identify the significance of
signal in that tile, see @ref{Quantifying signal in a tile}).
-As displayed in Figure 4 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa [2015]}, a wider kernel will help in identifying the skewness
caused by data in noise.
-The image that is convolved with this kernel is @emph{only} used for this
purpose.
-Once the mode is found to be sufficiently close to the median, the quantile
threshold is found on the image convolved with the sharper kernel
(@option{--kernel}), see @option{--qthresh}).
+The options below allow customization of the histogram and cumulative
frequency plots (for the @option{--histogram}, @option{--cumulative},
@option{--asciihist}, and @option{--asciicfp} options).
-Since convolution will significantly slow down the processing, this feature is
optional.
-When it is not given, the image that is convolved with @option{--kernel} will
be used to identify good tiles @emph{and} apply the quantile threshold.
-This option is mainly useful in conditions were you have a very large,
extended, diffuse signal that is still present in the usable tiles when using
@option{--kernel}.
-See @ref{Detecting large extended targets} for a practical demonstration on
how to inspect the tiles used in identifying the quantile threshold.
+@table @option
-@item --whdu=STR
-HDU containing the kernel file given to the @option{--widekernel} option.
+@item --numbins
+The number of bins (rows) to use in the histogram and the cumulative frequency
plot tables (outputs of @option{--histogram} and @option{--cumulative}).
-@item -L INT[,INT]
-@itemx --largetilesize=INT[,INT]
-The size of each tile for the tessellation with the larger tile sizes.
-Except for the tile size, all the other parameters for this tessellation are
taken from the common options described in @ref{Processing options}.
-The format is identical to that of the @option{--tilesize} option that is
discussed in that section.
-@end table
+@item --numasciibins
+The number of bins (characters) to use in the ASCII plots when printing the
histogram and the cumulative frequency plot (outputs of @option{--asciihist}
and @option{--asciicfp}).
-@node Detection options, NoiseChisel output, NoiseChisel input, Invoking
astnoisechisel
-@subsubsection Detection options
+@item --asciiheight
+The number of lines to use when printing the ASCII histogram and cumulative
frequency plot on the command-line (outputs of @option{--asciihist} and
@option{--asciicfp}).
-Detection is the process of separating the pixels in the image into two
groups: 1) Signal, and 2) Noise.
-Through the parameters below, you can customize the detection process in
NoiseChisel.
-Recall that you can always see the full list of NoiseChisel's options with the
@option{--help} (see @ref{Getting help}), or @option{--printparams} (or
@option{-P}) to see their values (see @ref{Operating mode options}).
+@item -n
+@itemx --normalize
+Normalize the histogram or cumulative frequency plot tables (outputs of
@option{--histogram} and @option{--cumulative}).
+For a histogram, the sum of all bins will become one and for a cumulative
frequency plot the last bin value will be one.
-@table @option
+@item --maxbinone
+Divide all the histogram values by the maximum bin value so it becomes one and
the rest are similarly scaled.
+In some situations (for example, if you want to plot the histogram and
cumulative frequency plot in one plot) this can be very useful.
-@item -Q FLT
-@itemx --meanmedqdiff=FLT
-The maximum acceptable distance between the quantiles of the mean and median
in each tile, see @ref{Quantifying signal in a tile}.
-The quantile threshold estimates are measured on tiles where the quantiles of
their mean and median are less distant than the value given to this option.
-For example, @option{--meanmedqdiff=0.01} means that only tiles where the
mean's quantile is between 0.49 and 0.51 (recall that the median's quantile is
0.5) will be used.
+@item --onebinstart=FLT
+Make sure that one bin starts at the value given to this option.
+In practice, this will shift the bins used to find the histogram and
+cumulative frequency plot such that one bin's lower interval becomes this
+value.
-@item -a INT
-@itemx --outliernumngb=INT
-Number of neighboring tiles to use for outlier rejection (mostly the wings of
bright stars or galaxies).
-For optimal detection of the wings of bright stars or galaxies, this is
@strong{the most important} option in NoiseChisel.
-This is because the extended wings of bright galaxies or stars (the PSF) can
become flat over the tile.
-In this case, they will satisfy the @option{--meanmedqdiff} condition and pass
that step.
-Therefore, to correctly identify such bad tiles, we need to look at the
neighboring nearby tiles.
-A tile that is on the wing of a bright galaxy/star will clearly be an outlier
when looking at the neighbors.
-For more on the details of the outlier rejection algorithm, see the latter
half of @ref{Quantifying signal in a tile}.
-If this option is given a value of zero, no outlier rejection will take place.
+For example, when a histogram range includes negative and positive values
+and zero has a special significance in your analysis, zero might fall
+somewhere inside one bin.
+As a result, that bin will count both negative and positive values.
+By setting @option{--onebinstart=0}, you can make sure that one bin will
+only count negative values in the vicinity of zero and the next bin will
+only count positive ones in that vicinity.
-@item --outliersclip=FLT,FLT
-@mymath{\sigma}-clipping parameters for the outlier rejection of the quantile
threshold.
-The format of the given values is similar to @option{--sigmaclip} below.
-In NoiseChisel, outlier rejection on tiles is used when identifying the
quantile thresholds (@option{--qthresh}, @option{--noerodequant}, and
@option{detgrowquant}).
+@cindex NaN
+Note that by default, the first column of the histogram and cumulative
+frequency plot shows the central value of each bin.
+So in the example above you will not see 0.000 in the first column; you will
+see two values that are symmetric about zero.
-Outlier rejection is useful when the dataset contains a large and diffuse
(almost flat within each tile) signal.
-The flatness of the profile will cause it to successfully pass the mean-median
quantile difference test, so we will need to use the distribution of successful
tiles for removing these false positives.
-For more, see the latter half of @ref{Quantifying signal in a tile}.
+If the value is not within the usable input range, this option will be ignored.
+When it is, this option is the last operation before the bins are finalized,
therefore it has a higher priority than options like @option{--manualbinrange}.
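+
+For example, a minimal sketch that forces one bin of the histogram to start
+at zero:
+
+@example
+$ aststatistics table.fits -cCOLUMN --histogram --onebinstart=0
+@end example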
-@item --outliersigma=FLT
-Multiple of sigma to define an outlier.
-If this option is given a value of zero, no outlier rejection will take place.
-For more see @option{--outliersclip} and the latter half of @ref{Quantifying
signal in a tile}.
+@item --manualbinrange
+Use the values given to @option{--greaterequal} and @option{--lessthan} to
+define the range of all bin-based calculations like the histogram.
+This option itself does not take any value; it just tells the program to use
+the values of those two options instead of the minimum and maximum values of
+the plot.
+If either of the two options is not given, the minimum or maximum will be
+used respectively.
+Therefore, if neither of them is called, this option is redundant.
-@item -t FLT
-@itemx --qthresh=FLT
-The quantile threshold to apply to the convolved image.
-The detection process begins with applying a quantile threshold to each of the
tiles in the small tessellation.
-The quantile is only calculated for tiles that do not have any significant
signal within them, see @ref{Quantifying signal in a tile}.
-Interpolation is then used to give a value to the unsuccessful tiles and it is
finally smoothed.
+The @option{--onebinstart} option has a higher priority than this option.
+In other words, @option{--onebinstart} takes effect after the range has been
finalized and the initial bins have been defined, therefore it has the power to
(possibly) shift the bins.
+If you want to manually set the range of the bins @emph{and} have one bin on a
special value, it is thus better to avoid @option{--onebinstart}.
-@cindex Quantile
-@cindex Binary image
-@cindex Foreground pixels
-@cindex Background pixels
-The quantile value is a floating point value between 0 and 1.
-Assume that we have sorted the @mymath{N} data elements of a distribution (the
pixels in each mesh on the convolved image).
-The quantile (@mymath{q}) of this distribution is the value of the element
with an index of (the nearest integer to) @mymath{q\times{N}} in the sorted
data set.
-After thresholding is complete, we will have a binary (two valued) image.
-The pixels above the threshold are known as foreground pixels (have a value of
1) while those which lie below the threshold are known as background (have a
value of 0).
+@item --numbins2=INT
+Similar to @option{--numbins}, but for the second column when a 2D histogram
is requested, see @option{--histogram2d}.
-@item --smoothwidth=INT
-Width of flat kernel used to smooth the interpolated quantile thresholds, see
@option{--qthresh} for more.
+@item --greaterequal2=FLT
+Similar to @option{--greaterequal}, but for the second column when a 2D
histogram is requested, see @option{--histogram2d}.
-@cindex NaN
-@item --checkqthresh
-Check the quantile threshold values on the mesh grid.
-A multi-extension FITS file, suffixed with @file{_qthresh.fits} will be
created showing each step of how the final quantile threshold is found.
-With this option, NoiseChisel will abort as soon as quantile estimation has
been completed, allowing you to inspect the steps leading to the final quantile
threshold, this can be disabled with @option{--continueaftercheck}.
-By default the output will have the same pixel size as the input, but with the
@option{--oneelempertile} option, only one pixel will be used for each tile
(see @ref{Processing options}).
+@item --lessthan2=FLT
+Similar to @option{--lessthan}, but for the second column when a 2D histogram
is requested, see @option{--histogram2d}.
-The key things to remember are:
-@itemize
-@item
-The measurements to find the thresholds are done on tiles that cover the whole
image in a tessellation.
-Recall that you can set the size of tiles with @option{--tilesize} and check
them with @option{--checktiles}.
-Therefore except for the first and last extensions, the rest only show tiles.
-@item
-NoiseChisel ultimately has three thresholds: the quantile threshold (that you
set with @option{--qthresh}), the no-erode quantile (set with
@option{--noerodequant}) and the growth quantile (set with
@option{--detgrowquant}).
-Therefore for each step, we have three extensions.
-@end itemize
+@item --onebinstart2=FLT
+Similar to @option{--onebinstart}, but for the second column when a 2D
histogram is requested, see @option{--histogram2d}.
-The output file will have the following extensions.
-Below, the extensions are put in the same order as you see in the file, with
their name.
+@end table
-@table @code
-@item CONVOLVED
-This is the input image after convolution with the kernel (which is a FWHM=2
Gaussian by default, but you can change with @option{--kernel}).
-Recall that the thresholds are defined on the convolved image.
+@node Fitting options, Contour options, Generating histograms and cumulative
frequency plots, Invoking aststatistics
+@subsubsection Fitting options
-@item QTHRESH_ERODE
-@itemx QTHRESH_NOERODE
-@itemx QTHRESH_EXPAND
-In these three extensions, the tiles that have a quantile-of-mean more/less
than 0.5 (quantile of median) @mymath{\pm d} are set to NaN (@mymath{d} is the
value given to @option{--meanmedqdiff}, see @ref{Quantifying signal in a tile}).
-Therefore the non-NaN tiles that you see here are the tiles where there is no
significant skewness (changing signal) within that tile.
-The only differing thing between the three extensions is the values of the
non-NaN tiles.
-These values will be used to construct the final threshold map over the whole
image.
+With the options below, you can customize the least squares fitting features
of Statistics.
+For a tutorial on using least squares fitting in Statistics, please see
+@ref{Least squares fitting}.
+Here, we will just review the details of each option.
-@item VALUE1_NO_OUTLIER
-@itemx VALUE2_NO_OUTLIER
-@itemx VALUE3_NO_OUTLIER
-All outlier tiles have been masked.
-The reason for removing outliers is that the quantile-of-mean is only
sensitive to signal that varies on a scale that is smaller than the tile size.
-Therefore the extended wings of large galaxies or bright stars (which vary on
scales much larger than the tile size) will pass that test.
-As described in @ref{Quantifying signal in a tile} outlier rejection is
customized through @option{--outliernumngb}, @option{--outliersclip} and
@option{--outliersigma}.
+To activate least squares fitting in Statistics, it is necessary to use the
@option{--fit} option to specify the type of fit you want to do.
+See the description of @option{--fit} for the various available fitting models.
+The fitting models that account for weights require three input columns, while
the non-weighted ones only take two input columns.
+Here is a summary of the input columns:
-@item THRESH1_INTERP
-@itemx THRESH2_INTERP
-@itemx THRESH3_INTERP
-Using the successful values that remain after the previous step, give values
to all (interpolate) the tiles in the image.
-The interpolation is done using the nearest-neighbor method: for each tile,
the N nearest neighbors are found and the median of their values is used to
fill it.
-You can set the value of N through the @option{--interpnumngb} option.
+@enumerate
+@item
+The first input column is assumed to be the independent variable (on the
horizontal axis of a plot, or @mymath{X} in the equations of each fit).
+@item
+The second input column is assumed to be the measured value (on the vertical
+axis of a plot, or @mymath{Y} in the equations of each fit).
+@item
+The third input column is only for fittings with a weight.
+It is assumed to be the ``weight'' of the measurement column.
+The nature of the ``weight'' can be set with the @option{--fitweight}
+option.
+For example, if you have the standard deviation of the error in @mymath{Y},
+you can use @option{--fitweight=std} (which is the default, so unless the
+default value has been changed, you will not need to set this).
+@end enumerate
-@item THRESH1_SMOOTH
-@itemx THRESH2_SMOOTH
-@itemx THRESH3_SMOOTH
-Smooth the interpolated image to remove the strong differences between
touching tiles.
-Because we used the median value of the N nearest neighbors in the previous
step, there can be strong discontinuities on the edges of tiles (which can
directly show in the image after applying the threshold).
-The scale of the smoothing (number of nearby tiles to smooth with) is set with
the @option{--smoothwidth} option.
+If three columns are given to a model without weight, or two columns are given
to a model that requires weights, Statistics will abort and inform you.
+Below you can see an example of fitting with the same linear model, once
weighted and once without weights.
-@item QTHRESH-APPLIED
-The pixels in this image can only have three values:
+@example
+$ aststatistics table.fits --column=X,Y --fit=linear
+$ aststatistics table.fits --column=X,Y,Yerr --fit=linear-weighted
+@end example
-@table @code
-@item 0
-These pixels had a value below the quantile threshold.
-@item 1
-These pixels had a value above the quantile threshold, but below the threshold
for no erosion.
-Therefore in the next step, NoiseChisel will erode (set them to 0) these
pixels if they are touching a 0-valued pixel.
-@item 2
-These pixels had a value above the no-erosion threshold.
-So NoiseChisel will not erode these pixels, it will only apply Opening to them
afterwards.
-Recall that this was done to avoid losing sharp point-sources (like stars in
space-based imaging).
-@end table
+The output of the fitting can be in one of the three modes listed below.
+For a complete example, see the tutorial in @ref{Least squares fitting}.
+@table @asis
+@item Human friendly format
+By default (for example, in the commands above) the output is an elaborate
+description of the model parameters.
+For example, @mymath{c_0} and @mymath{c_1} in the linear model
(@mymath{Y=c_0+c_1X}).
+Their covariance matrix and the reduced @mymath{\chi^2} of the fit are also
printed on the output.
+@item Raw numbers
+If you don't need the human friendly components of the output (which can be
+annoying when you want to parse the outputs in some scenarios), you can use
+the @option{--quiet} option.
+Only the raw output numbers will be printed.
+@item Estimate on a custom X column
+Through the @option{--fitestimate} option, you can specify an independent
table column to estimate the fit (it can also take a single value).
+See the description of this option for more.
@end table
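+
+For example, minimal sketches of the last two modes (building on the
+commands above, with the same assumed column names):
+
+@example
+## Only print the raw numbers (easy to parse in scripts).
+$ aststatistics table.fits -cX,Y --fit=linear --quiet
+
+## Estimate the fit on the same X column used in the fitting.
+$ aststatistics table.fits -cX,Y --fit=linear --fitestimate=self
+@end example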
-@item --blankasforeground
-In the erosion and opening steps below, treat blank elements as foreground
(regions above the threshold).
-By default, blank elements in the dataset are considered to be background, so
if a foreground pixel is touching it, it will be eroded.
-This option is irrelevant if the datasets contains no blank elements.
-
-When there are many blank elements in the dataset, treating them as foreground
will systematically erode their regions less, therefore systematically creating
more false positives.
-So use this option (when blank values are present) with care.
+@table @option
+@item -f STR
+@itemx --fit=STR
+The name of the fitting method to use; the acceptable values are listed
+below.
+They are based on the
+@url{https://www.gnu.org/software/gsl/doc/html/lls.html, linear} and
+@url{https://www.gnu.org/software/gsl/doc/html/nls.html, nonlinear}
+least-squares fitting functions of the GNU Scientific Library (GSL).
+@table @code
+@item linear
+@mymath{Y=c_0+c_1X}
+@item linear-weighted
+@mymath{Y=c_0+c_1X}; accounting for ``weights'' in @mymath{Y}.
+@item linear-no-constant
+@mymath{Y=c_1X}.
+@item linear-no-constant-weighted
+@mymath{Y=c_1X}; accounting for ``weights'' in @mymath{Y}.
+@item polynomial
+@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}; the maximum required power
(@mymath{n}) is specified by @option{--fitmaxpower}.
+@item polynomial-weighted
+@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}; accounting for ``weights'' in
@mymath{Y}.
+The maximum required power (@mymath{n}) is specified by @option{--fitmaxpower}.
+@item polynomial-robust
+@cindex Robust polynomial fit
+@cindex Polynomial fit (robust)
+@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}; rejects outliers.
+The function to use for outlier removal can be specified with the
@option{--fitrobust} option described below.
+This model doesn't take weights since they are calculated internally based
+on the outlier removal function (so it only requires two input columns).
+The maximum required power (@mymath{n}) is specified by @option{--fitmaxpower}.
-@item -e INT
-@itemx --erode=INT
-@cindex Erosion
-The number of erosions to apply to the binary thresholded image.
-Erosion is simply the process of flipping (from 1 to 0) any of the foreground
pixels that neighbor a background pixel.
-In a 2D image, there are two kinds of neighbors, 4-connected and 8-connected
neighbors.
-In a 3D dataset, there are three: 6-connected, 18-connected, and 26-connected.
-You can specify which class of neighbors should be used for erosion with the
@option{--erodengb} option, see below.
+For a comprehensive review of ``robust'' fitting and the available functions,
please see the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#robust-linear-regression,
Robust linear regression} section of the GNU Scientific Library.
+@end table
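+
+For example, a minimal sketch of a robust third-order polynomial fit (the
+column names are assumptions):
+
+@example
+$ aststatistics table.fits -cX,Y --fit=polynomial-robust \
+                --fitmaxpower=3 --fitrobust=bisquare
+@end example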
-Erosion has the effect of shrinking the foreground pixels.
-To put it another way, it expands the holes.
-This is a founding principle in NoiseChisel: it exploits the fact that with
very low thresholds, the holes in the very low surface brightness regions of an
image will be smaller than regions that have no signal.
-Therefore by expanding those holes, we are able to separate the regions
harboring signal.
+@item --fitweight=STR
+The nature of the ``weight'' column (when a weight is necessary for the model).
+It can take one of the following values:
+@table @code
+@item std
+Standard deviation of each @mymath{Y} axis measurement: this is the usual
``error'' associated with a measurement (for example, in @ref{MakeCatalog}) and
is the default value to this option.
+@item var
+Variance of each @mymath{Y} axis measurement.
+Assuming a Gaussian distribution with standard deviation @mymath{\sigma}, the
variance is @mymath{\sigma^2}.
+@item inv-var
+Inverse variance of each @mymath{Y} axis measurement.
+Assuming a Gaussian distribution with standard deviation @mymath{\sigma},
+the inverse variance is @mymath{1/\sigma^2}.
+@end table
-@item --erodengb=INT
-The type of neighborhood (structuring element) used in erosion, see
@option{--erode} for an explanation on erosion.
-If the input is a 2D image, only two integer values are acceptable: 4 or 8.
-For a 3D input datacube, the acceptable values are: 6, 18 and 26.
+@item --fitmaxpower=INT
+The maximum power (an integer) in a polynomial (@mymath{n} in
@mymath{Y=c_0+c_1X+c_2X^2+\cdots+c_nX^n}).
+This is only relevant when one of the polynomial models is given to
@option{--fit}.
+The fit will return @mymath{n+1} coefficients.
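+
+For example, a third-order polynomial fit
+(@mymath{Y=c_0+c_1X+c_2X^2+c_3X^3}, returning four coefficients) could
+be requested with a sketch like the one below (the file and column
+names are hypothetical):
+
+@example
+$ aststatistics table.fits -cX,Y --fit=polynomial --fitmaxpower=3
+@end example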
-In 2D 4-connectivity, the neighbors of a pixel are defined as the four pixels
on the top, bottom, right and left of a pixel that share an edge with it.
-The 8-connected neighbors on the other hand include the 4-connected neighbors
along with the other 4 pixels that share a corner with this pixel.
-See Figure 6 (a) and (b) in Akhlaghi and Ichikawa (2015) for a demonstration.
-A similar argument applies to 3D datacubes.
+@item --fitrobust=STR
+The function for rejecting outliers in the @code{polynomial-robust} fitting
model.
+For a comprehensive review of ``robust'' fitting and the available functions,
please see the
@url{https://www.gnu.org/software/gsl/doc/html/lls.html#robust-linear-regression,
Robust linear regression} section of the GNU Scientific Library.
+This function can take the following values:
+@table @code
+@item bisquare
+@cindex Tukey’s biweight (bisquare) function
+@cindex Biweight function of Tukey
+@cindex Bisquare function of Tukey
+Tukey’s biweight (bisquare) function; this is the default.
+According to the GSL manual, this is a good general purpose weight function.
+@item cauchy
+@cindex Cauchy's function (robust weight)
+@cindex Lorentzian function (robust weight)
+Cauchy’s function (also known as the Lorentzian function).
+It does not guarantee a unique solution, so it should be used with care.
+@item fair
+@cindex Fair function (robust weight)
+The fair function.
+It guarantees a unique solution and has continuous derivatives to three orders.
+@item huber
+@cindex Huber function (robust weight)
+Huber's @mymath{\rho} function.
+This is also a good general purpose weight function for rejecting outliers,
but can cause difficulty in some special scenarios.
+@item ols
+Ordinary Least Squares (OLS) solution with a constant weight of unity.
+@item welsch
+@cindex Welsch function (robust weight)
+The Welsch function, which is useful when the residuals follow an exponential
distribution.
+@end table
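+
+For example, a second-order robust polynomial fit, with the default
+bisquare function written explicitly, could look like the sketch below
+(the file and column names are hypothetical):
+
+@example
+$ aststatistics table.fits -cX,Y --fit=polynomial-robust \
+                --fitmaxpower=2 --fitrobust=bisquare
+@end example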
-@item --noerodequant
-Pure erosion is going to carve off sharp and small objects completely out of
the detected regions.
-This option can be used to avoid missing such sharp and small objects (which
have significant pixels, but not over a large area).
-All pixels with a value larger than the significance level specified by this
option will not be eroded during the erosion step above.
-However, they will undergo the erosion and dilation of the opening step below.
+@item --fitestimate=STR/FLT
+Estimate the fitted function at a single point or a complete column of points.
+The input @mymath{X} axis positions to estimate the function can be specified
in the following ways:
+@itemize
+@item
+A real number: the fitted function will be estimated at that @mymath{X}
position and the corresponding @mymath{Y} and its error will be printed to
standard output.
+@item
+@code{self}: in this mode, the same @mymath{X} axis column that was used in
the fit will be used for estimating the fitted function.
+This can be useful to visually/easily check the fit, see @ref{Least squares
fitting}.
+@item
+A file name: if the value is none of the above, Statistics expects it to be a
file name containing a table.
+If the file is a FITS file, the HDU containing the table should be specified
with the @option{--fitestimatehdu} option.
+The column of the table to use for the @mymath{X} axis points should be
specified with the @option{--fitestimatecol} option.
+@end itemize
+The output in this mode can be customized in the following ways:
+@itemize
+@item
+If a single floating point value is given to @option{--fitestimate}, the
fitted function will be estimated at that point and printed to the standard
output.
+@item
+When nothing is given to @option{--output}, the independent column and the
estimated values and errors are printed on the standard output.
+@item
+If a file name is given to @option{--output}, the estimated table above is
saved in that file.
+It can have any of the formats in @ref{Recognized table formats}.
+As a FITS file, all the fit outputs (coefficients, covariance matrix and
reduced @mymath{\chi^2}) are kept as FITS keywords in the same HDU of the
estimated table.
+For a complete example, see @ref{Least squares fitting}.
-Like the @option{--qthresh} option, the significance level is determined using
the quantile (a value between 0 and 1).
-Just as a reminder, in the normal distribution, @mymath{1\sigma},
@mymath{1.5\sigma}, and @mymath{2\sigma} are approximately on the 0.84, 0.93,
and 0.98 quantiles.
+When the covariance matrix (and thus the @mymath{\chi^2}) cannot be calculated
(for example if you only have two rows!), the printed values on the terminal
will be NaN.
+However, the FITS standard does not allow NaN values in keyword values!
+Therefore, when writing the @mymath{\chi^2} and covariance matrix elements
into the output FITS keywords, the largest value of the 64-bit floating point
type will be written: @mymath{1.79769313486232\times10^{308}}; see @ref{Numeric
data types}.
-@item -p INT
-@itemx --opening=INT
-Depth of opening to be applied to the eroded binary image.
-Opening is a composite operation.
-When opening a binary image with a depth of @mymath{n}, @mymath{n} erosions
(explained in @option{--erode}) are followed by @mymath{n} dilations.
-Simply put, dilation is the inverse of erosion.
-When dilating an image any background pixel is flipped (from 0 to 1) to become
a foreground pixel.
-Dilation has the effect of fattening the foreground.
-Note that in NoiseChisel, the erosion which is part of opening is independent
of the initial erosion that is done on the thresholded image (explained in
@option{--erode}).
-The structuring element for the opening can be specified with the
@option{--openingngb} option.
-Opening has the effect of removing the thin foreground connections (mostly
noise) between separate foreground `islands' (detections) thereby completely
isolating them.
-Once opening is complete, we have @emph{initial} detections.
+@item
+When @option{--quiet} is given with @option{--fitestimate}, the fitted
parameters are no longer printed on the standard output; they are available as
FITS keywords in the file given to @option{--output}.
+@end itemize
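+
+For example, the sketch below (with hypothetical file and column
+names) does a linear fit and estimates it over the same @mymath{X}
+column used in the fit, saving the estimated table (with the fit
+outputs as FITS keywords) in @file{est.fits}:
+
+@example
+$ aststatistics table.fits -cX,Y --fit=linear \
+                --fitestimate=self --output=est.fits
+@end example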
-@item --openingngb=INT
-The structuring element used for opening, see @option{--erodengb} for more
information about a structuring element.
+@item --fitestimatehdu=STR/INT
+Name or counter (counting from zero) of the HDU containing the table to be
used for estimating the fitted function over many points through
@option{--fitestimate}.
+For more on selecting a HDU, see the description of @option{--hdu} in
@ref{Input output options}.
-@item --skyfracnoblank
-Ignore blank pixels when estimating the fraction of undetected pixels for Sky
estimation.
-NoiseChisel only measures the Sky over the tiles that have a sufficiently
large fraction of undetected pixels (value given to @option{--minskyfrac}).
-By default this fraction is found by dividing number of undetected pixels in a
tile by the tile's area.
-But this default behavior ignores the possibility of blank pixels.
-In situations that blank/masked pixels are scattered across the image and if
they are large enough, all the tiles can fail the @option{--minskyfrac} test,
thus not allowing NoiseChisel to proceed.
-With this option, such scenarios can be fixed: the denominator of the fraction
will be the number of non-blank elements in the tile, not the total tile area.
+@item --fitestimatecol=STR/INT
+Name or counter (counting from one) of the column in the table given to
@option{--fitestimate}, containing the @mymath{X} axis points for estimating
the fitted function over many points.
+See @ref{Selecting table columns}.
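+
+As a sketch of the file-based estimation mode described above (where
+@file{table.fits}, @file{points.fits} and the column names are
+hypothetical):
+
+@example
+$ aststatistics table.fits -cX,Y --fit=linear \
+                --fitestimate=points.fits \
+                --fitestimatehdu=1 --fitestimatecol=X
+@end example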
+@end table
-@item -B FLT
-@itemx --minskyfrac=FLT
-Minimum fraction (value between 0 and 1) of Sky (undetected) areas in a tile.
-Only tiles with a fraction of undetected pixels (Sky) larger than this value
will be used to estimate the Sky value.
-NoiseChisel uses this option value twice to estimate the Sky value: after
initial detections and in the end when false detections have been removed.
-Because of the PSF and their intrinsic amorphous properties, astronomical
objects (except cosmic rays) never have a clear cutoff and commonly sink into
the noise very slowly.
-Even below the very low thresholds used by NoiseChisel.
-So when a large fraction of the area of one mesh is covered by detections, it
is very plausible that their faint wings are present in the undetected regions
(hence causing a bias in any measurement).
-To get an accurate measurement of the above parameters over the tessellation,
tiles that harbor too many detected regions should be excluded.
-The used tiles are visible in the respective @option{--check} option of the
given step.
-@item --checkdetsky
-Check the initial approximation of the sky value and its standard deviation in
a FITS file ending with @file{_detsky.fits}.
-With this option, NoiseChisel will abort as soon as the sky value used for
defining pseudo-detections is complete.
-This allows you to inspect the steps leading to the final quantile threshold,
this behavior can be disabled with @option{--continueaftercheck}.
-By default the output will have the same pixel size as the input, but with the
@option{--oneelempertile} option, only one pixel will be used for each tile
(see @ref{Processing options}).
-@item -s FLT,FLT
-@itemx --sigmaclip=FLT,FLT
-The @mymath{\sigma}-clipping parameters for measuring the initial and final
Sky values from the undetected pixels, see @ref{Sigma clipping}.
-This option takes two values which are separated by a comma (@key{,}).
-Each value can either be written as a single number or as a fraction of two
numbers (for example, @code{3,1/10}).
-The first value to this option is the multiple of @mymath{\sigma} that will be
clipped (@mymath{\alpha} in that section).
-The second value is the exit criteria.
-If it is less than 1, then it is interpreted as tolerance and if it is larger
than one it is assumed to be the fixed number of iterations.
-Hence, in the latter case the value must be an integer.
+@node Contour options, Statistics on tiles, Fitting options, Invoking
aststatistics
+@subsubsection Contour options
-@item -R FLT
-@itemx --dthresh=FLT
-The detection threshold: a multiple of the initial Sky standard deviation
added with the initial Sky approximation (which you can inspect with
@option{--checkdetsky}).
-This flux threshold is applied to the initially undetected regions on the
unconvolved image.
-The background pixels that are completely engulfed in a 4-connected foreground
region are converted to background (holes are filled) and one opening (depth of
1) is applied over both the initially detected and undetected regions.
-The Signal to noise ratio of the resulting `pseudo-detections' are used to
identify true vs. false detections.
-See Section 3.1.5 and Figure 7 in Akhlaghi and Ichikawa (2015) for a very
complete explanation.
+Contours are useful to highlight the 2D shape of a certain flux level over an
image.
+To derive contours in Statistics, you can use the option below:
-@item --dopening=INT
-The number of openings to do after applying @option{--dthresh}.
+@table @option
-@item --dopeningngb=INT
-The connectivity used in the opening of @option{--dopening}.
-In a 2D image this must be either 4 or 8.
-The stronger the connectivity, the more smaller regions will be discarded.
+@item -R FLT[,FLT[,FLT...]]
+@itemx --contour=FLT[,FLT[,FLT...]]
+@cindex Contour
+@cindex Plot: contour
+Write the contours for the requested levels in a file ending with
@file{_contour.txt}.
+It will have three columns: the first two are the coordinates of each point
and the third is the level it belongs to (one of the input values).
+Each disconnected contour region will be separated by a blank line.
+This is the requested format for adding contours with PGFPlots in @LaTeX{}.
+If any other format would be useful for your work, please let us know so we
can add it.
+If the image has World Coordinate System information, the written coordinates
will be in RA and Dec; otherwise, they will be in pixel coordinates.
-@item --holengb=INT
-The connectivity (defined by the number of neighbors) to fill holes after
applying @option{--dthresh} (above) to find pseudo-detections.
-For example, in a 2D image it must be 4 (the neighbors that are most strongly
connected) or 8 (all neighbors).
-The stronger the connectivity, the stronger the hole will be enclosed.
-So setting a value of 8 in a 2D image means that the walls of the hole are
4-connected.
-If standard (near Sky level) values are given to @option{--dthresh}, setting
@option{--holengb=4}, might fill the complete dataset and thus not create
enough pseudo-detections.
+Note that this is currently a very crude/simple implementation; please let us
know if you find problematic situations so we can fix it.
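+
+For example, the sketch below writes the contours of a hypothetical
+@file{image.fits} at three levels into a file ending with
+@file{_contour.txt}:
+
+@example
+$ aststatistics image.fits --contour=0.02,0.05,0.1
+@end example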
+@end table
-@item --pseudoconcomp=INT
-The connectivity (defined by the number of neighbors) to find individual
pseudo-detections.
-If it is a weaker connectivity (4 in a 2D image), then pseudo-detections that
are connected on the corners will be treated as separate.
+@node Statistics on tiles, , Contour options, Invoking aststatistics
+@subsubsection Statistics on tiles
-@item -m INT
-@itemx --snminarea=INT
-The minimum area to calculate the Signal to noise ratio on the
pseudo-detections of both the initially detected and undetected regions.
-When the area in a pseudo-detection is too small, the Signal to noise ratio
measurements will not be accurate and their distribution will be heavily skewed
to the positive.
-So it is best to ignore any pseudo-detection that is smaller than this area.
-Use @option{--detsnhistnbins} to check if this value is reasonable or not.
+All the options described until now were from the first class of operations
discussed above: those that treat the whole dataset as one.
+However, it often happens that the relative position of the dataset elements
over the dataset is significant.
+For example, you do not want one median value for the whole input image; you
want to know how the median changes over the image.
+For such operations, the input has to be tessellated (see @ref{Tessellation}).
+Thus this class of options cannot currently be called along with the options
above in one run of Statistics.
-@item --checksn
-Save the S/N values of the pseudo-detections (and possibly grown detections if
@option{--cleangrowndet} is called) into separate tables.
-If @option{--tableformat} is a FITS table, each table will be written into a
separate extension of one file suffixed with @file{_detsn.fits}.
-If it is plain text, a separate file will be made for each table (ending in
@file{_detsn_sky.txt}, @file{_detsn_det.txt} and @file{_detsn_grown.txt}).
-For more on @option{--tableformat} see @ref{Input output options}.
+@table @option
-You can use these to inspect the S/N values and their distribution (in
combination with the @option{--checkdetection} option to see where the
pseudo-detections are).
-You can use Gnuastro's @ref{Statistics} to make a histogram of the
distribution or any other analysis you would like for better understanding of
the distribution (for example, through a histogram).
+@item -t
+@itemx --ontile
+Do the respective single-valued calculation over one tile of the input
dataset, not the whole dataset.
+This option must be called with at least one of the single valued options
discussed above (for example, @option{--mean} or @option{--quantile}).
+The output will be a file in the same format as the input.
+If the @option{--oneelempertile} option is called, then one element/pixel will
be used for each tile (see @ref{Processing options}).
+Otherwise, the output will have the same size as the input, but each element
will have the value corresponding to that tile's value.
+If multiple single valued operations are called, then for each operation there
will be one extension in the output FITS file.
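+
+For example, the sketch below writes the mean and median of each tile
+of a hypothetical @file{image.fits} (one element per tile) into
+separate extensions of the output:
+
+@example
+$ aststatistics image.fits --mean --median --ontile \
+                --oneelempertile
+@end example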
-@item --minnumfalse=INT
-The minimum number of `pseudo-detections' over the undetected regions to
identify a Signal-to-Noise ratio threshold.
-The Signal to noise ratio (S/N) of false pseudo-detections in each tile is
found using the quantile of the S/N distribution of the pseudo-detections over
the undetected pixels in each mesh.
-If the number of S/N measurements is not large enough, the quantile will not
be accurate (can have large scatter).
-For example, if you set @option{--snquant=0.99} (or the top 1 percent), then
it is best to have at least 100 S/N measurements.
+@item -y
+@itemx --sky
+Estimate the Sky value on each tile as fully described in @ref{Quantifying
signal in a tile}.
+As described in that section, several options are necessary to configure the
Sky estimation which are listed below.
+The output file will have two extensions: the first is the Sky value and the
second is the Sky standard deviation on each tile.
+Similar to @option{--ontile}, if the @option{--oneelempertile} option is
called, then one element/pixel will be used for each tile (see @ref{Processing
options}).
-@item -c FLT
-@itemx --snquant=FLT
-The quantile of the Signal to noise ratio distribution of the
pseudo-detections in each mesh to use for filling the large mesh grid.
-Note that this is only calculated for the large mesh grids that satisfy the
minimum fraction of undetected pixels (value of @option{--minbfrac}) and
minimum number of pseudo-detections (value of @option{--minnumfalse}).
+@end table
-@item --snthresh=FLT
-Manually set the signal-to-noise ratio of true pseudo-detections.
-With this option, NoiseChisel will not attempt to find pseudo-detections over
the noisy regions of the dataset, but will directly go onto applying the
manually input value.
+The parameters for estimating the Sky value can be set with the following
options; except for @option{--sclipparams} (which is also used by
@option{--sigmaclip}), the rest are only used for the Sky value estimation.
-This option is useful in crowded images where there is no blank sky to find
the sky pseudo-detections.
-You can get this value on a similarly reduced dataset (from another region of
the Sky with more undetected regions).
+@table @option
-@item -d FLT
-@itemx --detgrowquant=FLT
-Quantile limit to ``grow'' the final detections.
-As discussed in the previous options, after applying the initial quantile
threshold, layers of pixels are carved off the objects to identify true signal.
-With this step you can return those low surface brightness layers that were
carved off back to the detections.
-To disable growth, set the value of this option to @code{1}.
+@item -k FITS
+@itemx --kernel=FITS
+File name of kernel to help in estimating the significance of signal in a
+tile, see @ref{Quantifying signal in a tile}.
-The process is as follows: after the true detections are found, all the
non-detected pixels above this quantile will be put in a list and used to
``grow'' the true detections (seeds of the growth).
-Like all quantile thresholds, this threshold is defined and applied to the
convolved dataset.
-Afterwards, the dataset is dilated once (with minimum connectivity) to connect
very thin regions on the boundary: imagine building a dam at the point rivers
spill into an open sea/ocean.
-Finally, all holes are filled.
-In the geography metaphor, holes can be seen as the closed (by the dams)
rivers and lakes, so this process is like turning the water in all such rivers
and lakes into soil.
-See @option{--detgrowmaxholesize} for configuring the hole filling.
+@item --khdu=STR
+Kernel HDU to help in estimating the significance of signal in a tile, see
+@ref{Quantifying signal in a tile}.
-Note that since the growth occurs on all neighbors of a data element, the
-quantile for 3D detection must be much larger than that of 2D
-detection. Recall that in 2D each element has 8 neighbors while in 3D there
-are 26 neighbors.
+@item --meanmedqdiff=FLT
+The maximum acceptable distance between the quantiles of the mean and median,
see @ref{Quantifying signal in a tile}.
+The initial Sky and its standard deviation estimates are measured on tiles
where the quantiles of their mean and median are less distant than the value
given to this option.
+For example, @option{--meanmedqdiff=0.01} means that only tiles where the
mean's quantile is between 0.49 and 0.51 (recall that the median's quantile is
0.5) will be used.
-@item --detgrowmaxholesize=INT
-The maximum hole size to fill during the final expansion of the true
detections as described in @option{--detgrowquant}.
-This is necessary when the input contains many smaller objects and can be used
to avoid marking blank sky regions as detections.
+@item --sclipparams=FLT,FLT
+The @mymath{\sigma}-clipping parameters, see @ref{Sigma clipping}.
+This option takes two values which are separated by a comma (@key{,}).
+Each value can either be written as a single number or as a fraction of two
numbers (for example, @code{3,1/10}).
+The first value to this option is the multiple of @mymath{\sigma} that will be
clipped (@mymath{\alpha} in that section).
+The second value is the exit criteria.
+If it is less than 1, then it is interpreted as a tolerance; if it is larger
than one, it is assumed to be the fixed number of iterations.
+Hence, in the latter case the value must be an integer.
-For example, multiple galaxies can be positioned such that they surround an
empty region of sky.
-If all the holes are filled, the Sky region in between them will be taken as a
detection which is not desired.
-To avoid such cases, the integer given to this option must be smaller than the
hole between such objects.
-However, we should caution that unless the ``hole'' is very large, the
combined faint wings of the galaxies might actually be present in between them,
so be very careful in not filling such holes.
+@item --outliersclip=FLT,FLT
+@mymath{\sigma}-clipping parameters for the outlier rejection of the Sky
+value (similar to @option{--sclipparams}).
-On the other hand, if you have a very large (and extended) galaxy, the diffuse
wings of the galaxy may create very large holes over the detections.
-In such cases, a large enough value to this option will cause all such holes
to be detected as part of the large galaxy and thus help in detecting it to
extremely low surface brightness limits.
-Therefore, especially when large and extended objects are present in the
image, it is recommended to give this option (very) large values.
-For one real-world example, see @ref{Detecting large extended targets}.
+Outlier rejection is useful when the dataset contains a large and diffuse
(almost flat within each tile) signal.
+The flatness of the profile will cause it to successfully pass the mean-median
quantile difference test, so we need to use the distribution of successful
tiles to remove these false positives.
+For more, see the latter half of @ref{Quantifying signal in a tile}.
-@item --cleangrowndet
-After dilation, if the signal-to-noise ratio of a detection is less than the
derived pseudo-detection S/N limit, that detection will be discarded.
-In an ideal/clean noise, a true detection's S/N should be larger than its
constituent pseudo-detections because its area is larger and it also covers
more signal.
-However, on false detections (especially at lower @option{--snquant}
values), the increase in size can cause a decrease in S/N below that threshold.
+@item --outliernumngb=INT
+Number of neighboring tiles to use for outlier rejection (mostly the wings of
bright stars or galaxies).
+If this option is given a value of zero, no outlier rejection will take place.
+For more see the latter half of @ref{Quantifying signal in a tile}.
-This will improve purity and not change completeness (a true detection will
not be discarded).
-Because a true detection has flux in its vicinity and dilation will catch more
of that flux and increase the S/N.
-So on a true detection, the final S/N cannot be less than pseudo-detections.
+@item --outliersigma=FLT
+Multiple of sigma to define an outlier in the Sky value estimation.
+If this option is given a value of zero, no outlier rejection will take place.
+For more see @option{--outliersclip} and the latter half of @ref{Quantifying
signal in a tile}.
-However, in many real images bad processing creates artifacts that cannot be
accurately removed by the Sky subtraction.
-In such cases, this option will decrease the completeness (will artificially
discard true detections).
-So this feature is not default and should be explicitly called when you
know the noise is clean.
+@item --smoothwidth=INT
+Width of a flat kernel to convolve the interpolated tile values.
+Tile interpolation is done using the median of the @option{--interpnumngb}
neighbors of each tile (see @ref{Processing options}).
+If this option is given a value of zero or one, no smoothing will be done.
+Without smoothing, strong boundaries will probably be created between the
values estimated for each tile.
+It is thus good to smooth the interpolated image so strong discontinuities do
not show up in the final Sky values.
+The smoothing is done through convolution (see @ref{Convolution process}) with
a flat kernel, so the value to this option must be an odd number.
+@item --ignoreblankintiles
+Do not set the input's blank pixels to blank in the tiled outputs (for
example, Sky and Sky standard deviation extensions of the output).
+This is only applicable when the tiled output has the same size as the input,
in other words, when @option{--oneelempertile} is not called.
-@item --checkdetection
-Every step of the detection process will be added as an extension to a file
with the suffix @file{_det.fits}.
-Going through each would just be a repeat of the explanations above and also
of those in Akhlaghi and Ichikawa (2015).
-The extension label should be sufficient to recognize which step you are
observing.
-Viewing all the steps can be the best guide in choosing the best set of
parameters.
-With this option, NoiseChisel will abort as soon as a snapshot of all the
detection process is saved.
-This behavior can be disabled with @option{--continueaftercheck}.
+By default, blank values in the input (commonly on the edges which are outside
the survey/field area) will be set to blank in the tiled outputs also.
+But in other scenarios this default behavior is not desired; for example, if
you have masked something in the input, but still want the tiled output to
cover that region also.
@item --checksky
-Check the derivation of the final sky and its standard deviation values on the
mesh grid.
-With this option, NoiseChisel will abort as soon as the sky value is estimated
over the image (on each tile).
-This behavior can be disabled with @option{--continueaftercheck}.
-By default the output will have the same pixel size as the input, but with the
@option{--oneelempertile} option, only one pixel will be used for each tile
(see @ref{Processing options}).
+Create a multi-extension FITS file showing the steps that were used to
estimate the Sky value over the input, see @ref{Quantifying signal in a tile}.
+The file will have two extensions for each step (one for the Sky and one for
the Sky standard deviation).
@end table
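+
+For example, the sketch below estimates the Sky of a hypothetical
+@file{image.fits} over tiles of 75 by 75 pixels, with explicit
+@mymath{\sigma}-clipping parameters, keeping one element per tile:
+
+@example
+$ aststatistics image.fits --sky --tilesize=75,75 \
+                --sclipparams=3,0.1 --oneelempertile
+@end example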
+@node NoiseChisel, Segment, Statistics, Data analysis
+@section NoiseChisel
+@cindex Labeling
+@cindex Detection
+@cindex Segmentation
+Once instrumental signatures are removed from the raw data (image) in the
initial reduction process (see @ref{Data manipulation}), you are naturally
eager to start answering the scientific questions that motivated the data
collection in the first place.
+However, the raw dataset/image is just an array of values/pixels, that is all!
These raw values cannot directly be used to answer your scientific questions;
for example, ``how many galaxies are there in the image?'' and ``what is their
magnitude?''.
+The first high-level step of your analysis will therefore be to classify, or
label, the dataset elements (pixels) into two classes:
+1) Noise, where random effects are the major contributor to the value, and
+2) Signal, where non-random factors (for example, light from a distant galaxy)
are present.
+This classification of the elements in a dataset is formally known as
@emph{detection}.
+In an observational/experimental dataset, signal is always buried in noise:
only mock/simulated datasets are free of noise.
+Therefore detection, or the process of separating signal from noise,
determines the number of objects you study and the accuracy of any higher-level
measurement you do on them.
+Detection is thus the most important step of any analysis and is not trivial.
+In particular, the most scientifically interesting astronomical targets are
faint and can have a large variety of morphologies, along with a large
distribution in magnitude and size.
+Therefore when noise is significant, proper detection of your targets is a
uniquely decisive step in your final scientific analysis/result.
-@node NoiseChisel output, , Detection options, Invoking astnoisechisel
-@subsubsection NoiseChisel output
+@cindex Erosion
+NoiseChisel is Gnuastro's program for detection of targets that do not have a
sharp border (almost all astronomical objects).
+When the targets have sharp edges/borders (for example, cells in biological
imaging), a simple threshold is enough to separate them from noise and each
other (if they are not touching).
+To detect such sharp-edged targets, you can use Gnuastro's Arithmetic program
in a command like below (assuming the threshold is @code{100}, see
@ref{Arithmetic}):
-NoiseChisel's output is a multi-extension FITS file.
-The main extension/dataset is a (binary) detection map.
-It has the same size as the input but with only two possible values for all
pixels: 0 (for pixels identified as noise) and 1 (for those identified as
signal/detections).
-The detection map is followed by a Sky and Sky standard deviation dataset
(which are calculated from the binary image).
-By default (when @option{--rawoutput} is not called), NoiseChisel will also
subtract the Sky value from the input and save the sky-subtracted input as the
first extension in the output with data.
-The zero-th extension (that contains no data), contains NoiseChisel's
configuration as FITS keywords, see @ref{Output FITS files}.
+@example
+$ astarithmetic in.fits 100 gt 2 connected-components
+@end example
-The name of the output file can be set by giving a value to @option{--output}
(this is a common option between all programs and is therefore discussed in
@ref{Input output options}).
-If @option{--output} is not used, the input name will be suffixed with
@file{_detected.fits} and used as output, see @ref{Automatic output}.
-If any of the options starting with @option{--check*} are given, NoiseChisel
will not complete and will abort as soon as the respective check images are
created.
-For more information on the different check images, see the description for
the @option{--check*} options in @ref{Detection options} (this can be disabled
with @option{--continueaftercheck}).
+Since almost no astronomical target has such sharp edges, we need a more
advanced detection methodology.
+NoiseChisel uses a new noise-based paradigm for detection of very extended and
diffuse targets that are drowned deeply in the ocean of noise.
+It was initially introduced in @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa [2015]} and improvements after the first four years were published
in @url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}.
+Please take the time to go through these papers to most effectively
understand the need for NoiseChisel and how best to use it.
-The last two extensions of the output are the Sky and its Standard deviation,
see @ref{Sky value} for a complete explanation.
-They are calculated on the tile grid that you defined for NoiseChisel.
-By default these datasets will have the same size as the input, but with all
the pixels in one tile given one value.
-To be more space-efficient (keep only one pixel per tile), you can use the
@option{--oneelempertile} option, see @ref{Tessellation}.
+The name of NoiseChisel is derived from the first thing it does after
thresholding the dataset: to erode it.
+In mathematical morphology, erosion on pixels can be pictured as carving-off
boundary pixels.
+Hence, what NoiseChisel does is similar to what a wood or stone chisel does.
+It is just not hardware, but software.
+In fact, looking at it as a chisel and your dataset as a solid cube of rock
will greatly help in effectively understanding and optimally using it: with
NoiseChisel you literally carve your targets out of the noise.
+Try running it with the @option{--checkdetection} option, and open the
temporary output as a multi-extension cube, to see each step of the carving
process on your input dataset (see @ref{Viewing FITS file contents with DS9 or
TOPCAT}).
-@cindex GNOME
-To inspect any of NoiseChisel's output files, assuming you use SAO DS9, you
can configure your Graphic User Interface (GUI) to open NoiseChisel's output as
a multi-extension data cube.
-This will allow you to flip through the different extensions and visually
inspect the results.
-This process has been described for the GNOME GUI (most common GUI in
GNU/Linux operating systems) in @ref{Viewing FITS file contents with DS9 or
TOPCAT}.
+@cindex Segmentation
+NoiseChisel's primary output is a binary detection map with the same size as
the input but its pixels only have two values: 0 (background) and 1
(foreground).
+Pixels that do not harbor any detected signal (noise) are given a label (or
value) of zero and those with a value of 1 have been identified as hosting
signal.
-NoiseChisel's output configuration options are described in detail below.
+Segmentation is the process of classifying the signal into higher-level
constructs.
+For example, if you have two separate galaxies in one image, NoiseChisel will
give a value of 1 to the pixels of both (each forming an ``island'' of touching
foreground pixels).
+After segmentation, the connected foreground pixels will get separate labels,
enabling you to study them individually.
+NoiseChisel is only focused on detection (separating signal from noise); to
@emph{segment} the signal (into separate galaxies, for example), Gnuastro has
a separate specialized program: @ref{Segment}.
+NoiseChisel's output can be directly/readily fed into Segment.
-@table @option
-@item --continueaftercheck
-Continue NoiseChisel after any of the options starting with @option{--check}
(see @ref{Detection options}.
-NoiseChisel involves many steps and as a result, there are many checks,
allowing you to inspect the status of the processing.
-The results of each step affect the next steps of processing.
-Therefore, when you want to check the status of the processing at one step,
the time spent to complete NoiseChisel is just wasted/distracting time.
+For more on NoiseChisel's output format and its benefits (especially in
conjunction with @ref{Segment} and later @ref{MakeCatalog}), please see
@url{https://arxiv.org/abs/1611.06387, Akhlaghi [2016]}.
+Just note that when that paper was published, Segment was not yet spun-off
into a separate program, and NoiseChisel did both detection and segmentation.
+
+NoiseChisel's output is designed to be generic enough to be easily used in any
higher-level analysis.
+If your targets are not touching after running NoiseChisel and you are not
interested in their sub-structure, you do not need the Segment program at all.
+You can ask NoiseChisel to find the connected pixels in the output with the
@option{--label} option.
+In this case, the output will not be a binary image any more; the signal will
have counters/labels starting from 1 for each connected group of pixels.
+You can then directly feed NoiseChisel's output into MakeCatalog for
measurements over the detections and the production of a catalog (see
@ref{MakeCatalog}).
+
+Thanks to the published papers mentioned above, there is no need to provide a
more complete introduction to NoiseChisel in this book.
+Published papers, however, cannot be updated any more, while the software has
evolved/changed.
+The changes since publication are documented in @ref{NoiseChisel changes after
publication}.
+In @ref{Invoking astnoisechisel}, the details of running NoiseChisel and its
options are discussed.
+
+As discussed above, detection is one of the most important steps for your
scientific result.
+It is therefore very important to obtain a good understanding of NoiseChisel
(and afterwards @ref{Segment} and @ref{MakeCatalog}).
+We strongly recommend reviewing the two tutorials @ref{General program usage
tutorial} and @ref{Detecting large extended targets}.
+They are designed to show how to most effectively use NoiseChisel for the
detection of small faint objects and large extended objects.
+In the meantime, they also show the modular principle behind Gnuastro's
programs and how they are built to complement, and build upon, each other.
+
+@ref{General program usage tutorial} culminates in using NoiseChisel to detect
galaxies and use its outputs to find the galaxy colors.
+Defining colors is a very common process in most science-cases.
+Therefore it is also recommended to (patiently) complete that tutorial for
optimal usage of NoiseChisel in conjunction with all the other Gnuastro
programs.
+@ref{Detecting large extended targets} shows how you can optimize
NoiseChisel's settings for very extended objects, to successfully carve them
out of the noise down to signal-to-noise ratio levels below 1/10.
+After going through those tutorials, play a little with the settings (in the
order presented in the paper and @ref{Invoking astnoisechisel}) on a dataset
you are familiar with and inspect all the check images (options starting with
@option{--check}) to see the effect of each parameter.
-To encourage easier experimentation with the option values, when you use any
of the NoiseChisel options that start with @option{--check}, NoiseChisel will
abort once its desired extensions have been written.
-With @option{--continueaftercheck} option, you can disable this behavior and
ask NoiseChisel to continue with the rest of the processing, even after the
requested check files are complete.
+Below, in @ref{Invoking astnoisechisel}, we will review NoiseChisel's input,
detection, and output options in @ref{NoiseChisel input}, @ref{Detection
options}, and @ref{NoiseChisel output}.
+If you have used NoiseChisel within your research, please run it with
@option{--cite} to list the papers you should cite and how to acknowledge its
funding sources.
-@item --ignoreblankintiles
-Do Not set the input's blank pixels to blank in the tiled outputs (for
example, Sky and Sky standard deviation extensions of the output).
-This is only applicable when the tiled output has the same size as the input,
in other words, when @option{--oneelempertile} is not called.
+@menu
+* NoiseChisel changes after publication:: Updates since published papers.
+* Invoking astnoisechisel:: Options and arguments for NoiseChisel.
+@end menu
-By default, blank values in the input (commonly on the edges which are outside
the survey/field area) will be set to blank in the tiled outputs also.
-But in other scenarios this default behavior is not desired; for example, if
you have masked something in the input, but want the tiled output under that
also.
+@node NoiseChisel changes after publication, Invoking astnoisechisel,
NoiseChisel, NoiseChisel
+@subsection NoiseChisel changes after publication
-@item -l
-@itemx --label
-Run a connected-components algorithm on the finally detected pixels to
identify which pixels are connected to which.
-By default the main output is a binary dataset with only two values: 0 (for
noise) and 1 (for signal/detections).
-See @ref{NoiseChisel output} for more.
+NoiseChisel was initially introduced in @url{https://arxiv.org/abs/1505.01664,
Akhlaghi and Ichikawa [2015]} and updates after the first four years were
published in @url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}.
+To help in understanding how it works, those papers have many figures showing
every step on multiple mock and real examples.
+We recommend reading these papers for a good understanding of what it does
and how each parameter influences the output.
-The purpose of NoiseChisel is to detect targets that are extended and diffuse,
with outer parts that sink into the noise very gradually (galaxies and stars
for example).
-Since NoiseChisel digs down to extremely low surface brightness values, many
such targets will commonly be detected together as a single large body of
connected pixels.
+The papers, however, cannot be updated anymore, while NoiseChisel has evolved
(and will continue to do so): better algorithms or steps have been found and
implemented, and some options have been added, removed, or changed behavior.
+This book is thus the definitive, up-to-date guide to NoiseChisel.
+The aim of this section is to make the transition from the papers above to
the installed version on your system as smooth as possible, through the list
below.
+For a more detailed list of changes in each Gnuastro version, please see the
@file{NEWS} file@footnote{The @file{NEWS} file is present in the released
Gnuastro tarball, see @ref{Release tarball}.}.
-To properly separate connected objects, sophisticated segmentation methods are
commonly necessary on NoiseChisel's output.
-Gnuastro has the dedicated @ref{Segment} program for this job.
-Since input images are commonly large and can take a significant volume, the
extra volume necessary to store the labels of the connected components in the
detection map (which will be created with this @option{--label} option, in
32-bit signed integer type) can thus be a major waste of space.
-Since the default output is just a binary dataset, an 8-bit unsigned dataset
is enough.
+@itemize
+@item
+An improved outlier rejection for identifying tiles without any signal has
been implemented in the quantile-threshold phase.
+Prior to version 0.14, outliers were defined globally: the distribution of all
tiles with an acceptable @option{--meanmedqdiff} was inspected and outliers
were found and rejected.
+However, this caused problems when there are strong gradients over the image
(for example, an image prior to flat-fielding, or in the presence of a large
foreground galaxy).
+In these cases, the faint wings of galaxies/stars could be mistakenly
identified as Sky (leaving a footprint of the object on the Sky output) and
wrongly subtracted.
-The binary output will also encourage users to segment the result separately
prior to doing higher-level analysis.
-As an alternative to @option{--label}, if you have the binary detection image,
you can use the @code{connected-components} operator in Gnuastro's Arithmetic
program to identify regions that are connected with each other.
-For example, with this command (assuming NoiseChisel's output is called
@file{nc.fits}):
+It was possible to play with the parameters to correct this for that
particular dataset, but that was frustrating.
+Therefore from version 0.14, instead of finding outliers from the full tile
distribution, we now measure the @emph{slope} of the tile's nearby tiles and
find outliers locally.
+Three options have been added to configure this part of NoiseChisel:
@option{--outliernumngb}, @option{--outliersclip} and @option{--outliersigma}.
+For more on the local outlier-by-distance algorithm and the definition of
@emph{slope} mentioned above, see @ref{Quantifying signal in a tile}.
+In our tests, this gave a much improved estimate of the quantile thresholds
and final Sky values with default values.
+@end itemize
-@example
-$ astarithmetic nc.fits 2 connected-components -hDETECTIONS
-@end example
-@item --rawoutput
-Do Not include the Sky-subtracted input image as the first extension of the
output.
-By default, the Sky-subtracted input is put in the first extension of the
output.
-The next extensions are NoiseChisel's main outputs described above.
-The extra Sky-subtracted input can be convenient in checking NoiseChisel's
output and comparing the detection map with the input: visually see if
everything you expected is detected (reasonable completeness) and that you do
not have too many false detections (reasonable purity).
-This visual inspection is simplified if you use SAO DS9 to view NoiseChisel's
output as a multi-extension data-cube, see @ref{Viewing FITS file contents with
DS9 or TOPCAT}.
-When you are satisfied with your NoiseChisel configuration (therefore you do
not need to check on every run), or you want to archive/transfer the outputs,
or the datasets become large, or you are running NoiseChisel as part of a
pipeline, this Sky-subtracted input image can be a significant burden (take up
a large volume).
-The fact that the input is also noisy, makes it hard to compress it
efficiently.
+@node Invoking astnoisechisel, , NoiseChisel changes after publication,
NoiseChisel
+@subsection Invoking NoiseChisel
-In such cases, this @option{--rawoutput} can be used to avoid the extra
sky-subtracted input in the output.
-It is always possible to easily produce the Sky-subtracted dataset from the
input (assuming it is in extension @code{1} of @file{in.fits}) and the
@code{SKY} extension of NoiseChisel's output (let's call it @file{nc.fits})
with a command like below (assuming NoiseChisel was not run with
@option{--oneelempertile}, see @ref{Tessellation}):
+NoiseChisel will detect signal in noise, producing a multi-extension dataset
containing a binary detection map of the same size as the input.
+Its output can be readily used as input into @ref{Segment} for higher-level
segmentation, or into @ref{MakeCatalog} to do measurements and generate a
catalog.
+The executable name is @file{astnoisechisel} with the following general
template
@example
-$ astarithmetic in.fits nc.fits - -h1 -hSKY
+$ astnoisechisel [OPTION ...] InputImage.fits
@end example
-@end table
-@cartouche
@noindent
-@cindex Compression
-@strong{Save space:} with the @option{--rawoutput} and
@option{--oneelempertile}, NoiseChisel's output will only be one binary
detection map and two much smaller arrays with one value per tile.
-Since none of these have noise they can be compressed very effectively
(without any loss of data) with exceptionally high compression ratios.
-This makes it easy to archive, or transfer, NoiseChisel's output even on huge
datasets.
-To compress it with the most efficient method (take up less volume), run the
following command:
+One line examples:
-@cindex GNU Gzip
@example
-$ gzip --best noisechisel_output.fits
-@end example
-
-@noindent
-The resulting @file{.fits.gz} file can then be fed into any of Gnuastro's
programs directly, or viewed in viewers like SAO DS9, without having to
decompress it separately (they will just take a little longer, because they
have to internally decompress it before starting).
-See @ref{NoiseChisel optimization for storage} for an example on a real
dataset.
-@end cartouche
+## Detect signal in input.fits.
+$ astnoisechisel input.fits
+## Inspect all the detection steps after changing a parameter.
+$ astnoisechisel input.fits --qthresh=0.4 --checkdetection
+## Detect signal assuming input has 4 amplifier channels along first
+## dimension and 1 along the second. Also set the regular tile size
+## to 100 along both dimensions:
+$ astnoisechisel --numchannels=4,1 --tilesize=100,100 input.fits
+@end example
+@cindex Gaussian
+@noindent
+If NoiseChisel is to do any processing (for example, you do not just want to
get help, or see the values of each input parameter), an input image should be
provided with one of the recognized extensions (see @ref{Arguments}).
+NoiseChisel shares a large set of common operations with other Gnuastro
programs, mainly regarding input/output, general processing steps, and general
operating modes.
+To help in a unified experience between all of Gnuastro's programs, these
operations have the same command-line options, see @ref{Common options} for a
full list/description (they are not repeated here).
+As in all Gnuastro programs, options can also be given to NoiseChisel in
configuration files.
+For a thorough description on Gnuastro's configuration file parsing, please
see @ref{Configuration files}.
+All of NoiseChisel's options with a short description are also always
available on the command-line with the @option{--help} option, see @ref{Getting
help}.
+To inspect the option values without actually running NoiseChisel, append your
command with @option{--printparams} (or @option{-P}).
+NoiseChisel's input image may contain blank elements (see @ref{Blank pixels}).
+Blank elements will be ignored in all steps of NoiseChisel.
+Hence if your dataset has bad pixels which should be masked with a mask image,
please use Gnuastro's @ref{Arithmetic} program (in particular its
@command{where} operator) to convert those pixels to blank pixels before
running NoiseChisel.
+Gnuastro's Arithmetic program has bitwise operators helping you select
specific kinds of bad-pixels when necessary.
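+
+For example, assuming the bad pixels are flagged in a hypothetical
+binary image @file{badmask.fits} (in HDU 1, like the input), such a
+masking command could look like the sketch below (see the
+@command{where} operator in @ref{Arithmetic operators}):
+
+@example
+$ astarithmetic image.fits badmask.fits nan where -g1 \
+                --output=masked.fits
+@end example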
+A convolution kernel can also be optionally given.
+If a value (file name) is given to @option{--kernel} on the command-line or in
a configuration file (see @ref{Configuration files}), then that file will be
used to convolve the image prior to thresholding.
+Otherwise a default kernel will be used.
+For a 2D image, the default kernel is a 2D Gaussian with a FWHM of 2 pixels
truncated at 5 times the FWHM.
+This choice of the default kernel is discussed in Section 3.1.1 of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}.
+For a 3D cube, it is a Gaussian with FWHM of 1.5 pixels in the first two
dimensions and 0.75 pixels in the third dimension.
+See @ref{Convolution kernel} for kernel related options.
+Passing @code{none} to @option{--kernel} will disable convolution.
+On the other hand, through the @option{--convolved} option, you may provide an
already convolved image, see descriptions below for more.
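+
+For example, if convolution is not desired (the input is already
+convolved, or you want to threshold the raw pixel values), it can be
+disabled like below:
+
+@example
+$ astnoisechisel input.fits --kernel=none
+@end example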
+NoiseChisel defines two tessellations over the input (see @ref{Tessellation}).
+This enables it to deal with possible gradients in the input dataset and also
significantly improve speed by processing each tile on different threads
simultaneously.
+Tessellation related options are discussed in @ref{Processing options}.
+In particular, NoiseChisel uses two tessellations (with everything between
them identical except the tile sizes): a fine-grained one with smaller tiles
(used in thresholding and Sky value estimations) and another with larger tiles
which is used for pseudo-detections over non-detected regions of the image.
+The common Tessellation options described in @ref{Processing options} define
all parameters of both tessellations.
+The large tile size for the latter tessellation is set through the
@option{--largetilesize} option.
+To inspect the tessellations on your input dataset, run NoiseChisel with
@option{--checktiles}.
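+
+For example, the sketch below sets small tiles of 80 by 80 pixels and
+large tiles of 300 by 300 pixels, and saves a check image of the
+tessellations over the input:
+
+@example
+$ astnoisechisel input.fits --tilesize=80,80 \
+                --largetilesize=300,300 --checktiles
+@end example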
+@cartouche
+@noindent
+@strong{Usage TIP:} Frequently use the options starting with @option{--check}.
+Since the noise properties differ between different datasets, you can often
play with the parameters/options for a better result than the default
parameters.
+You can start with @option{--checkdetection} for the main steps.
+For the full list of NoiseChisel's checking options please run:
+@example
+$ astnoisechisel --help | grep check
+@end example
+@end cartouche
+@cartouche
+@noindent
+@strong{Not detecting wings of bright galaxies:} In such cases, probably the
best solution is to increase @option{--outliernumngb} (to reject tiles that are
affected by very flat diffuse signal).
+For more, see @ref{Quantifying signal in a tile}.
+@end cartouche
-@node Segment, MakeCatalog, NoiseChisel, Data analysis
-@section Segment
+When working on 3D datacubes, the tessellation options need three values and
updating them every time can be annoying and error-prone.
+To simplify the job, NoiseChisel also installs a @file{astnoisechisel-3d.conf}
configuration file (see @ref{Configuration files}).
+You can use this for default values on datacubes.
+For example, if you installed Gnuastro with the prefix @file{/usr/local} (the
default location, see @ref{Installation directory}), you can benefit from this
configuration file by running NoiseChisel like the example below.
-Once signal is separated from noise (for example, with @ref{NoiseChisel}), you
have a binary dataset: each pixel is either signal (1) or noise (0).
-Signal (for example, every galaxy in your image) has been ``detected'', but
all detections have a label of 1.
-Therefore while we know which pixels contain signal, we still cannot find out
how many galaxies they contain or which detected pixels correspond to which
galaxy.
-At the lowest (most generic) level, detection is a kind of segmentation
(segmenting the whole dataset into signal and noise, see @ref{NoiseChisel}).
-Here, we will define segmentation only on signal: to separate sub-structure
within the detections.
+@example
+$ astnoisechisel cube.fits \
+ --config=/usr/local/etc/astnoisechisel-3d.conf
+@end example
-@cindex Connected component labeling
-If the targets are clearly separated, or their detected regions are not
touching, a simple connected
components@footnote{@url{https://en.wikipedia.org/wiki/Connected-component_labeling}}
algorithm (very basic segmentation) is enough to separate the regions that are
touching/connected.
-This is such a basic and simple form of segmentation that Gnuastro's
Arithmetic program has an operator for it: see @code{connected-components} in
@ref{Arithmetic operators}.
-Assuming the binary dataset is called @file{binary.fits}, you can use it with
a command like this:
+@cindex Shell alias
+@cindex Alias (shell)
+@cindex Shell startup
+@cindex Startup, shell
+To further simplify the process, you can define a shell alias in any startup
file (for example, @file{~/.bashrc}, see @ref{Installation directory}).
+Assuming that you installed Gnuastro in @file{/usr/local}, you can add this
line to the startup file (you may put it all in one line, it is broken into two
lines here for fitting within page limits).
@example
-$ astarithmetic binary.fits 2 connected-components
+alias astnoisechisel-3d="astnoisechisel \
+ --config=/usr/local/etc/astnoisechisel-3d.conf"
@end example
@noindent
-You can even do a very basic detection (a threshold, say at value
-@code{100}) @emph{and} segmentation in Arithmetic with a single command
-like below:
+Using this alias, you can call NoiseChisel with the name
@command{astnoisechisel-3d} (instead of @command{astnoisechisel}).
+It will automatically load the 3D specific configuration file first, and then
parse any other arguments, options or configuration files.
+You can change the default values in this 3D configuration file by calling
them on the command-line as you do with
@command{astnoisechisel}@footnote{Recall that for single-invocation options,
the last command-line invocation takes precedence over all previous invocations
(including those in the 3D configuration file).
+See the description of @option{--config} in @ref{Operating mode options}.}.
+For example:
@example
-$ astarithmetic in.fits 100 gt 2 connected-components
+$ astnoisechisel-3d --numchannels=3,3,1 cube.fits
@end example
-However, in most astronomical situations our targets are not nicely separated
or have a sharp boundary/edge (for a threshold to suffice): they touch (for
example, merging galaxies), or are simply in the same line-of-sight (which is
much more common).
-This causes their images to overlap.
-In particular, when you do your detection with NoiseChisel, you will detect
signal to very low surface brightness limits: deep into the faint wings of
galaxies or bright stars (which can extend very far and irregularly from their
center).
-Therefore, it often happens that several galaxies are detected as one large
detection.
-Since they are touching, a simple connected components algorithm will not
suffice.
-It is therefore necessary to do a more sophisticated segmentation and break up
the detected pixels (even those that are touching) into multiple target objects
as accurately as possible.
+Below, we will discuss NoiseChisel's options, classified into two general classes, to help in easy navigation.
+@ref{NoiseChisel input} mainly discusses the basic options relating to the inputs and the steps prior to detection.
+Afterwards, @ref{Detection options} fully describes every configuration parameter (option) related to detection and how it affects the final result.
+The order of options in this section follows the logical order of the steps within NoiseChisel.
+On a first reading (while you are still new to NoiseChisel), it is therefore strongly recommended to read the options in the order given below.
+The output of @option{--printparams} (or @option{-P}) also has this order.
+However, the output of @option{--help} is sorted alphabetically.
+Finally, in @ref{NoiseChisel output}, the format of NoiseChisel's output is discussed.
-Segment will use a detection map and its corresponding dataset to find
sub-structure over the detected areas and use them for its segmentation.
-Until Gnuastro version 0.6 (released in 2018), Segment was part of
@ref{NoiseChisel}.
-Therefore, similar to NoiseChisel, the best place to start reading about
Segment and understanding what it does (with many illustrative figures) is
Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
[2015]}, and continue with @url{https://arxiv.org/abs/1909.11230, Akhlaghi
[2019]}.
-@cindex river
-@cindex Watershed algorithm
-As a summary, Segment first finds true @emph{clump}s over the detections.
-Clumps are associated with local maxima/minima@footnote{By default the maximum
is used as the first clump pixel, to define clumps based on local minima, use
the @option{--minima} option.} and extend over the neighboring pixels until
they reach a local minimum/maximum (@emph{river}/@emph{watershed}).
-By default, Segment will use the distribution of clump signal-to-noise ratios
over the undetected regions as reference to find ``true'' clumps over the
detections.
-Using the undetected regions can be disabled by directly giving a
signal-to-noise ratio to @option{--clumpsnthresh}.
+@menu
+* NoiseChisel input:: NoiseChisel's input options.
+* Detection options:: Configure detection in NoiseChisel.
+* NoiseChisel output:: NoiseChisel's output options and format.
+@end menu
-The true clumps are then grown to a certain threshold over the detections.
-Based on the strength of the connections (rivers/watersheds) between the grown
clumps, they are considered parts of one @emph{object} or as separate
@emph{object}s.
-See Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and
Ichikawa [2015]} for more.
-Segment's main output are thus two labeled datasets: 1) clumps, and 2) objects.
-See @ref{Segment output} for more.
+@node NoiseChisel input, Detection options, Invoking astnoisechisel, Invoking
astnoisechisel
+@subsubsection NoiseChisel input
-To start learning about Segment, especially in relation to detection
(@ref{NoiseChisel}) and measurement (@ref{MakeCatalog}), the recommended
references are @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
[2015]}, @url{https://arxiv.org/abs/1611.06387, Akhlaghi [2016]} and
@url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}.
-If you have used Segment within your research, please run it with
@option{--cite} to list the papers you should cite and how to acknowledge its
funding sources.
+The options here can be used to configure the inputs and output of
NoiseChisel, along with some general processing options.
+Recall that you can always see the full list of Gnuastro's options with the
@option{--help} (see @ref{Getting help}), or @option{--printparams} (or
@option{-P}) to see their values (see @ref{Operating mode options}).
-Those papers cannot be updated any more but the software will evolve.
-For example, Segment became a separate program (from NoiseChisel) in 2018
(after those papers were published).
-Therefore this book is the definitive reference.
-@c To help in the transition from those papers to the software you are using,
see @ref{Segment changes after publication}.
-Finally, in @ref{Invoking astsegment}, we will discuss Segment's inputs,
outputs and configuration options.
+@table @option
+
+@item -k FITS
+@itemx --kernel=FITS
+File name of kernel to smooth the image before applying the threshold, see
@ref{Convolution kernel}.
+If no convolution is needed, give this option a value of @option{none}.
+
+The first step of NoiseChisel is to convolve/smooth the image and use the
convolved image in multiple steps including the finding and applying of the
quantile threshold (see @option{--qthresh}).
+The @option{--kernel} option is not mandatory.
+If not called, for a 2D image, a 2D Gaussian profile with a FWHM of 2 pixels, truncated at 5 times the FWHM, is used.
+This choice of the default kernel is discussed in Section 3.1.1 of Akhlaghi
and Ichikawa [2015].
+
+For a 3D cube, when no file name is given to @option{--kernel}, a Gaussian
with FWHM of 1.5 pixels in the first two dimensions and 0.75 pixels in the
third dimension will be used.
+The reason for this particular configuration is that, in astronomical applications, 3D datasets commonly do not have the same nature in all three dimensions: the first two dimensions are usually spatial (RA and Dec) while the third is spectral (for example, wavelength).
+The samplings are also different: in the default case, the spatial sampling is assumed to be larger (coarser) than the spectral sampling, hence the wider FWHM in the spatial directions, see @ref{Sampling theorem}.
+
+You can use MakeProfiles to build a kernel with any of its recognized profile
types and parameters.
+For more details, please see @ref{MakeProfiles output dataset}.
+For example, the command below will make a Moffat kernel (with
@mymath{\beta=2.8}) with FWHM of 2 pixels truncated at 10 times the FWHM.
+
+@example
+$ astmkprof --oversample=1 --kernel=moffat,2,2.8,10
+@end example
+
+Since convolution can be the slowest step of NoiseChisel, for large datasets,
you can convolve the image once with Gnuastro's Convolve (see @ref{Convolve}),
and use the @option{--convolved} option to feed it directly to NoiseChisel.
+This can help in getting faster results while you are playing with (testing) the higher-level options.
+
+@item --khdu=STR
+HDU containing the kernel in the file given to the @option{--kernel}
+option.
+@item --convolved=FITS
+Use this file as the convolved image and do not do convolution (ignore
@option{--kernel}).
+NoiseChisel will just check that the size of the given dataset is the same as the input's size.
+If a wrong image (with the same size) is given to this option, the results
(errors, bugs, etc.) are unpredictable.
+So please use this option with care and in a highly controlled environment,
for example, in the scenario discussed below.
-@menu
-* Invoking astsegment:: Inputs, outputs and options to Segment
-@end menu
+In almost all situations, as the input gets larger, the single most CPU (and
time) consuming step in NoiseChisel (and other programs that need a convolved
image) is convolution.
+Therefore minimizing the number of convolutions can save a significant amount
of time in some scenarios.
+One such scenario is when you want to segment NoiseChisel's detections using
the same kernel (with @ref{Segment}, which also supports this
@option{--convolved} option).
+This scenario would require two convolutions of the same dataset: once by
NoiseChisel and once by Segment.
+Using this option in both programs, only one convolution (prior to running
NoiseChisel) is enough.
-@c @node Segment changes after publication, Invoking astsegment, Segment,
Segment
-@c @subsection Segment changes after publication
+Another common scenario where this option can be convenient is when you are
testing NoiseChisel (or Segment) for the best parameters.
+You have to run NoiseChisel multiple times and see the effect of each change.
+However, once you are happy with the kernel, re-convolving the input on every
change of higher-level parameters will greatly hinder, or discourage, further
testing.
+With this option, you can convolve the input image with your chosen kernel
once before running NoiseChisel, then feed it to NoiseChisel on each test run
and thus save valuable time for better/more tests.
-@c Segment's main algorithm and working strategy were initially defined and
introduced in Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa [2015]} and @url{https://arxiv.org/abs/1909.11230, Akhlaghi
[2019]}.
-@c It is strongly recommended to read those papers for a good understanding of
what Segment does, how it relates to detection, and how each parameter
influences the output.
-@c They have many figures showing every step on multiple mock and real
examples.
+To build your desired convolution kernel, you can use @ref{MakeProfiles}.
+To convolve the image with a given kernel you can use @ref{Convolve}.
+Spatial domain convolution is mandatory: in the frequency domain, blank pixels
(if present) will cover the whole image and gradients will appear on the edges,
see @ref{Spatial vs. Frequency domain}.
-@c However, the papers cannot be updated anymore, but Segment has evolved (and
will continue to do so): better algorithms or steps have been (and will be)
found.
-@c This book is thus the final and definitive guide to Segment.
-@c The aim of this section is to make the transition from the paper to your
installed version, as smooth as possible through the list below.
-@c For a more detailed list of changes in previous Gnuastro releases/versions,
please follow the @file{NEWS} file@footnote{The @file{NEWS} file is present in
the released Gnuastro tarball, see @ref{Release tarball}.}.
+Below you can see an example of the second scenario: you want to see how
variation of the growth level (through the @option{--detgrowquant} option) will
affect the final result.
+Recall that you can ignore all the extra spaces, newlines, and backslashes (@code{\}) if you are typing in the terminal.
+In a shell script, remove the @code{$} signs at the start of the lines.
-@node Invoking astsegment, , Segment, Segment
-@subsection Invoking Segment
+@example
+## Make the kernel to convolve with.
+$ astmkprof --oversample=1 --kernel=gaussian,2,5
-Segment will identify substructure within the detected regions of an input
image.
-Segment's output labels can be directly used for measurements (for example,
with @ref{MakeCatalog}).
-The executable name is @file{astsegment} with the following general template
+## Convolve the input with the given kernel.
+$ astconvolve input.fits --kernel=kernel.fits \
+ --domain=spatial --output=convolved.fits
-@example
-$ astsegment [OPTION ...] InputImage.fits
+## Run NoiseChisel with seven growth quantile values.
+$ for g in 60 65 70 75 80 85 90; do \
+ astnoisechisel input.fits --convolved=convolved.fits \
+ --detgrowquant=0.$g --output=$g.fits; \
+ done
@end example
-@noindent
-One line examples:
-@example
-## Segment NoiseChisel's detected regions.
-$ astsegment default-noisechisel-output.fits
-## Use a hand-input S/N value for keeping true clumps
-## (avoid finding the S/N using the undetected regions).
-$ astsegment nc-out.fits --clumpsnthresh=10
+@item --chdu=STR
+The HDU/extension containing the convolved image in the file given to
@option{--convolved}.
-## Inspect all the segmentation steps after changing a parameter.
-$ astsegment input.fits --snquant=0.9 --checksegmentaion
+@item -w FITS
+@itemx --widekernel=FITS
+File name of a wider kernel to use in estimating the difference of the mode
and median in a tile (this difference is used to identify the significance of
signal in that tile, see @ref{Quantifying signal in a tile}).
+As displayed in Figure 4 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa [2015]}, a wider kernel will help in identifying the skewness
caused by data in noise.
+The image that is convolved with this kernel is @emph{only} used for this
purpose.
+Once the mode is found to be sufficiently close to the median, the quantile threshold is found on the image convolved with the sharper kernel (@option{--kernel}; see @option{--qthresh}).
-## Use the fixed value of 0.01 for the input's Sky standard deviation
-## (in the units of the input), and assume all the pixels are a
-## detection (for example, a large structure extending over the whole
-## image), and only keep clumps with S/N>10 as true clumps.
-$ astsegment in.fits --std=0.01 --detection=all --clumpsnthresh=10
-@end example
+Since convolution will significantly slow down the processing, this feature is
optional.
+When it is not given, the image that is convolved with @option{--kernel} will
be used to identify good tiles @emph{and} apply the quantile threshold.
+This option is mainly useful in conditions where you have a very large, extended, diffuse signal that is still present in the usable tiles when using @option{--kernel}.
+See @ref{Detecting large extended targets} for a practical demonstration on
how to inspect the tiles used in identifying the quantile threshold.
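+
+As a minimal sketch (assuming your input is @file{image.fits}, and using an illustratively wider Gaussian; neither value is a recommendation): first build the wider kernel with MakeProfiles, then give it to NoiseChisel with this option.
+
+@example
+## Build a wider Gaussian kernel (FWHM of 5 pixels, truncated
+## at 5 times the FWHM).
+$ astmkprof --oversample=1 --kernel=gaussian,5,5 \
+            --output=wide.fits
+
+## Use it only for identifying the good tiles; the threshold
+## itself is still measured on the image convolved with the
+## sharper kernel of '--kernel'.
+$ astnoisechisel image.fits --widekernel=wide.fits
+@end example
+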
-@cindex Gaussian
-@noindent
-If Segment is to do processing (for example, you do not want to get help, or
see the values of each option), at least one input dataset is necessary along
with detection and error information, either as separate datasets (per-pixel)
or fixed values, see @ref{Segment input}.
-Segment shares a large set of common operations with other Gnuastro programs,
mainly regarding input/output, general processing steps, and general operating
modes.
-To help in a unified experience between all of Gnuastro's programs, these
common operations have the same names and defined in @ref{Common options}.
+@item --whdu=STR
+HDU containing the kernel file given to the @option{--widekernel} option.
-As in all Gnuastro programs, options can also be given to Segment in
configuration files.
-For a thorough description of Gnuastro's configuration file parsing, please
see @ref{Configuration files}.
-All of Segment's options with a short description are also always available on
the command-line with the @option{--help} option, see @ref{Getting help}.
-To inspect the option values without actually running Segment, append your
command with @option{--printparams} (or @option{-P}).
+@item -L INT[,INT]
+@itemx --largetilesize=INT[,INT]
+The size of each tile for the tessellation with the larger tile sizes.
+Except for the tile size, all the other parameters for this tessellation are
taken from the common options described in @ref{Processing options}.
+The format is identical to that of the @option{--tilesize} option that is
discussed in that section.
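+
+For example, a sketch with illustrative sizes (tune them to your data), setting both tessellations in one call:
+
+@example
+## 100x100 small tiles and 250x250 large tiles.
+$ astnoisechisel image.fits --tilesize=100,100 \
+                 --largetilesize=250,250
+@end example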
+@end table
-To help in easy navigation between Segment's options, they are separately
discussed in the three sub-sections below: @ref{Segment input} discusses how
you can customize the inputs to Segment.
-@ref{Segmentation options} is devoted to options specific to the high-level
segmentation process.
-Finally, in @ref{Segment output}, we will discuss options that affect
Segment's output.
+@node Detection options, NoiseChisel output, NoiseChisel input, Invoking
astnoisechisel
+@subsubsection Detection options
-@menu
-* Segment input:: Input files and options.
-* Segmentation options:: Parameters of the segmentation process.
-* Segment output:: Outputs of Segment
-@end menu
+Detection is the process of separating the pixels in the image into two
groups: 1) Signal, and 2) Noise.
+Through the parameters below, you can customize the detection process in
NoiseChisel.
+Recall that you can always see the full list of NoiseChisel's options with the
@option{--help} (see @ref{Getting help}), or @option{--printparams} (or
@option{-P}) to see their values (see @ref{Operating mode options}).
-@node Segment input, Segmentation options, Invoking astsegment, Invoking
astsegment
-@subsubsection Segment input
+@table @option
-Besides the input dataset (for example, astronomical image), Segment also
needs to know the Sky standard deviation and the regions of the dataset that it
should segment.
-The values dataset is assumed to be Sky subtracted by default.
-If it is not, you can ask Segment to subtract the Sky internally by calling
@option{--sky}.
-For the rest of this discussion, we will assume it is already sky subtracted.
+@item -Q FLT
+@itemx --meanmedqdiff=FLT
+The maximum acceptable distance between the quantiles of the mean and median
in each tile, see @ref{Quantifying signal in a tile}.
+The quantile threshold estimates are measured on tiles where the quantiles of
their mean and median are less distant than the value given to this option.
+For example, @option{--meanmedqdiff=0.01} means that only tiles where the
mean's quantile is between 0.49 and 0.51 (recall that the median's quantile is
0.5) will be used.
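+
+For example, a sketch of loosening this limit (the value is only illustrative) and inspecting which tiles then pass, with the @option{--checkqthresh} option described below:
+
+@example
+$ astnoisechisel image.fits --meanmedqdiff=0.02 --checkqthresh
+@end example
+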
-The Sky and its standard deviation can be a single value (to be used for the
whole dataset) or a separate dataset (for a separate value per pixel).
-If a dataset is used for the Sky and its standard deviation, they must either
be the size of the input image, or have a single value per tile (generated with
@option{--oneelempertile}, see @ref{Processing options} and @ref{Tessellation}).
+@item -a INT
+@itemx --outliernumngb=INT
+Number of neighboring tiles to use for outlier rejection (mostly the wings of
bright stars or galaxies).
+For optimal detection of the wings of bright stars or galaxies, this is
@strong{the most important} option in NoiseChisel.
+This is because the extended wings of bright galaxies or stars (the PSF) can
become flat over the tile.
+In this case, they will satisfy the @option{--meanmedqdiff} condition and pass
that step.
+Therefore, to correctly identify such bad tiles, we need to look at the nearby tiles.
+A tile that is on the wing of a bright galaxy/star will clearly be an outlier
when looking at the neighbors.
+For more on the details of the outlier rejection algorithm, see the latter
half of @ref{Quantifying signal in a tile}.
+If this option is given a value of zero, no outlier rejection will take place.
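+
+As an illustrative sketch, you can compare the tiles that are kept with outlier rejection disabled (a value of zero) against the default behavior, using the check file of @option{--checkqthresh} (described below):
+
+@example
+$ astnoisechisel image.fits --outliernumngb=0 --checkqthresh
+@end example
+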
-The detected regions/pixels can be specified as a detection map (for example,
see @ref{NoiseChisel output}).
-If @option{--detection=all}, Segment will not read any detection map and
assume the whole input is a single detection.
-For example, when the dataset is fully covered by a large nearby
galaxy/globular cluster.
+@item --outliersclip=FLT,FLT
+@mymath{\sigma}-clipping parameters for the outlier rejection of the quantile
threshold.
+The format of the given values is similar to @option{--sigmaclip} below.
+In NoiseChisel, outlier rejection on tiles is used when identifying the quantile thresholds (@option{--qthresh}, @option{--noerodequant}, and @option{--detgrowquant}).
-When dataset are to be used for any of the inputs, Segment will assume they
are multiple extensions of a single file by default (when @option{--std} or
@option{--detection} are not called).
-For example, NoiseChisel's default output @ref{NoiseChisel output}.
-When the Sky-subtracted values are in one file, and the detection and Sky
standard deviation are in another, you just need to use @option{--detection}:
in the absence of @option{--std}, Segment will look for both the detection
labels and Sky standard deviation in the file given to @option{--detection}.
-Ultimately, if all three are in separate files, you need to call both
@option{--detection} and @option{--std}.
+Outlier rejection is useful when the dataset contains a large and diffuse
(almost flat within each tile) signal.
+The flatness of the profile will cause it to successfully pass the mean-median
quantile difference test, so we will need to use the distribution of successful
tiles for removing these false positives.
+For more, see the latter half of @ref{Quantifying signal in a tile}.
-The extensions of the three mandatory inputs can be specified with
@option{--hdu}, @option{--dhdu}, and @option{--stdhdu}.
-For a full discussion on what to give to these options, see the description of
@option{--hdu} in @ref{Input output options}.
-To see their default values (along with all the other options), run Segment
with the @option{--printparams} (or @option{-P}) option.
-Just recall that in the absence of @option{--detection} and @option{--std},
all three are assumed to be in the same file.
-If you only want to see Segment's default values for HDUs on your system, run
this command:
+@item --outliersigma=FLT
+Multiple of sigma to define an outlier.
+If this option is given a value of zero, no outlier rejection will take place.
+For more see @option{--outliersclip} and the latter half of @ref{Quantifying
signal in a tile}.
-@example
-$ astsegment -P | grep hdu
-@end example
+@item -t FLT
+@itemx --qthresh=FLT
+The quantile threshold to apply to the convolved image.
+The detection process begins with applying a quantile threshold to each of the
tiles in the small tessellation.
+The quantile is only calculated for tiles that do not have any significant
signal within them, see @ref{Quantifying signal in a tile}.
+Interpolation is then used to give a value to the unsuccessful tiles, and the result is finally smoothed.
-By default Segment will convolve the input with a kernel to improve the
signal-to-noise ratio of true peaks.
-If you already have the convolved input dataset, you can pass it directly to
Segment for faster processing (using the @option{--convolved} and
@option{--chdu} options).
-Just do not forget that the convolved image must also be Sky-subtracted before
calling Segment.
-If a value/file is given to @option{--sky}, the convolved values will also be
Sky subtracted internally.
-Alternatively, if you prefer to give a kernel (with @option{--kernel} and
@option{--khdu}), Segment can do the convolution internally.
-To disable convolution, use @option{--kernel=none}.
+@cindex Quantile
+@cindex Binary image
+@cindex Foreground pixels
+@cindex Background pixels
+The quantile value is a floating point value between 0 and 1.
+Assume that we have sorted the @mymath{N} data elements of a distribution (the
pixels in each mesh on the convolved image).
+The quantile (@mymath{q}) of this distribution is the value of the element
with an index of (the nearest integer to) @mymath{q\times{N}} in the sorted
data set.
+After thresholding is complete, we will have a binary (two valued) image.
+The pixels above the threshold are known as foreground pixels (have a value of
1) while those which lie below the threshold are known as background (have a
value of 0).
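+
+For example, for a tile with @mymath{N=10000} sorted pixels, @option{--qthresh=0.3} corresponds to the value of the element with index @mymath{0.3\times10000=3000}.
+A sketch of changing this threshold (the value is only illustrative) and inspecting its effect:
+
+@example
+$ astnoisechisel image.fits --qthresh=0.25 --checkqthresh
+@end example
+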
-@table @option
+@item --smoothwidth=INT
+Width of flat kernel used to smooth the interpolated quantile thresholds, see
@option{--qthresh} for more.
-@item --sky=STR/FLT
-The Sky value(s) to subtract from the input.
-This option can either be given a constant number or a file name containing a
dataset (multiple values, per pixel or per tile).
-By default, Segment will assume the input dataset is Sky subtracted, so this
option is not mandatory.
+@cindex NaN
+@item --checkqthresh
+Check the quantile threshold values on the mesh grid.
+A multi-extension FITS file, suffixed with @file{_qthresh.fits} will be
created showing each step of how the final quantile threshold is found.
+With this option, NoiseChisel will abort as soon as quantile estimation has been completed, allowing you to inspect the steps leading to the final quantile threshold; this behavior can be disabled with @option{--continueaftercheck}.
+By default the output will have the same pixel size as the input, but with the
@option{--oneelempertile} option, only one pixel will be used for each tile
(see @ref{Processing options}).
-If the value cannot be read as a number, it is assumed to be a file name.
-When the value is a file, the extension can be specified with
@option{--skyhdu}.
-When it is not a single number, the given dataset must either have the same
size as the output or the same size as the tessellation (so there is one pixel
per tile, see @ref{Tessellation}).
+The key things to remember are:
+@itemize
+@item
+The measurements to find the thresholds are done on tiles that cover the whole
image in a tessellation.
+Recall that you can set the size of tiles with @option{--tilesize} and check
them with @option{--checktiles}.
+Therefore except for the first and last extensions, the rest only show tiles.
+@item
+NoiseChisel ultimately has three thresholds: the quantile threshold (that you
set with @option{--qthresh}), the no-erode quantile (set with
@option{--noerodequant}) and the growth quantile (set with
@option{--detgrowquant}).
+Therefore for each step, we have three extensions.
+@end itemize
-When this option is given, its value(s) will be subtracted from the input and
the (optional) convolved dataset (given to @option{--convolved}) prior to
starting the segmentation process.
+The output file will have the following extensions.
+Below, the extensions are listed in the same order as they appear in the file, along with their names.
-@item --skyhdu=STR/INT
-The HDU/extension containing the Sky values.
-This is mandatory when the value given to @option{--sky} is not a number.
-Please see the description of @option{--hdu} in @ref{Input output options} for
the different ways you can identify a special extension.
+@table @code
+@item CONVOLVED
+This is the input image after convolution with the kernel (which is a FWHM=2
Gaussian by default, but you can change with @option{--kernel}).
+Recall that the thresholds are defined on the convolved image.
-@item --std=STR/FLT
-The Sky standard deviation value(s) corresponding to the input.
-The value can either be a constant number or a file name containing a dataset
(multiple values, per pixel or per tile).
-The Sky standard deviation is mandatory for Segment to operate.
+@item QTHRESH_ERODE
+@itemx QTHRESH_NOERODE
+@itemx QTHRESH_EXPAND
+In these three extensions, the tiles that have a quantile-of-mean more/less
than 0.5 (quantile of median) @mymath{\pm d} are set to NaN (@mymath{d} is the
value given to @option{--meanmedqdiff}, see @ref{Quantifying signal in a tile}).
+Therefore the non-NaN tiles that you see here are the tiles where there is no
significant skewness (changing signal) within that tile.
+The only differing thing between the three extensions is the values of the
non-NaN tiles.
+These values will be used to construct the final threshold map over the whole
image.
-If the value cannot be read as a number, it is assumed to be a file name.
-When the value is a file, the extension can be specified with
@option{--skyhdu}.
-When it is not a single number, the given dataset must either have the same
size as the output or the same size as the tessellation (so there is one pixel
per tile, see @ref{Tessellation}).
+@item VALUE1_NO_OUTLIER
+@itemx VALUE2_NO_OUTLIER
+@itemx VALUE3_NO_OUTLIER
+All outlier tiles have been masked.
+The reason for removing outliers is that the quantile-of-mean is only
sensitive to signal that varies on a scale that is smaller than the tile size.
+Therefore the extended wings of large galaxies or bright stars (which vary on
scales much larger than the tile size) will pass that test.
+As described in @ref{Quantifying signal in a tile} outlier rejection is
customized through @option{--outliernumngb}, @option{--outliersclip} and
@option{--outliersigma}.
-When this option is not called, Segment will assume the standard deviation is
a dataset and in a HDU/extension (@option{--stdhdu}) of another one of the
input file(s).
-If a file is given to @option{--detection}, it will assume that file contains
the standard deviation dataset, otherwise, it will look into input filename
(the main argument, without any option).
+@item THRESH1_INTERP
+@itemx THRESH2_INTERP
+@itemx THRESH3_INTERP
+Using the successful values that remain after the previous step, give values
to all (interpolate) the tiles in the image.
+The interpolation is done using the nearest-neighbor method: for each tile,
the N nearest neighbors are found and the median of their values is used to
fill it.
+You can set the value of N through the @option{--interpnumngb} option.
+
+@item THRESH1_SMOOTH
+@itemx THRESH2_SMOOTH
+@itemx THRESH3_SMOOTH
+Smooth the interpolated image to remove the strong differences between
touching tiles.
+Because we used the median value of the N nearest neighbors in the previous
step, there can be strong discontinuities on the edges of tiles (which can
directly show in the image after applying the threshold).
+The scale of the smoothing (number of nearby tiles to smooth with) is set with
the @option{--smoothwidth} option.
+
+@item QTHRESH-APPLIED
+The pixels in this image can only have three values:
-@item --stdhdu=INT/STR
-The HDU/extension containing the Sky standard deviation values, when the value
given to @option{--std} is a file name.
-Please see the description of @option{--hdu} in @ref{Input output options} for
the different ways you can identify a special extension.
+@table @code
+@item 0
+These pixels had a value below the quantile threshold.
+@item 1
+These pixels had a value above the quantile threshold, but below the threshold
for no erosion.
+Therefore in the next step, NoiseChisel will erode (set them to 0) these
pixels if they are touching a 0-valued pixel.
+@item 2
+These pixels had a value above the no-erosion threshold.
+So NoiseChisel will not erode these pixels; it will only apply opening to them afterwards.
+Recall that this is done to avoid losing sharp point sources (like stars in space-based imaging).
+@end table
+@end table
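+
+A minimal sketch for producing and then viewing this multi-extension check file (assuming your input is @file{image.fits}; the check file's name is built from the input's base name):
+
+@example
+$ astnoisechisel image.fits --checkqthresh
+$ astscript-fits-view image_qthresh.fits
+@end example
+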
-@item --variance
-The input Sky standard deviation value/dataset is actually variance.
-When this option is called, the square root of input Sky standard deviation
(see @option{--std}) is used internally, not its raw value(s).
+@item --blankasforeground
+In the erosion and opening steps below, treat blank elements as foreground
(regions above the threshold).
+By default, blank elements in the dataset are considered to be background, so any foreground pixel touching them will be eroded.
+This option is irrelevant if the dataset contains no blank elements.
-@item -d FITS
-@itemx --detection=FITS
-Detection map to use for segmentation.
-If given a value of @option{all}, Segment will assume the whole dataset must
be segmented, see below.
-If a detection map is given, the extension can be specified with
@option{--dhdu}.
-If not given, Segment will assume the desired HDU/extension is in the main
input argument (input file specified with no option).
+When there are many blank elements in the dataset, treating them as foreground
will systematically erode their regions less, therefore systematically creating
more false positives.
+So use this option (when blank values are present) with care.
-The final segmentation (clumps or objects) will only be over the non-zero
pixels of this detection map.
-The dataset must have the same size as the input image.
-Only datasets with an integer type are acceptable for the labeled image, see
@ref{Numeric data types}.
-If your detection map only has integer values, but it is stored in a floating
point container, you can use Gnuastro's Arithmetic program (see
@ref{Arithmetic}) to convert it to an integer container, like the example below:
+@item -e INT
+@itemx --erode=INT
+@cindex Erosion
+The number of erosions to apply to the binary thresholded image.
+Erosion is simply the process of flipping (from 1 to 0) any of the foreground
pixels that neighbor a background pixel.
+In a 2D image, there are two kinds of neighbors, 4-connected and 8-connected
neighbors.
+In a 3D dataset, there are three: 6-connected, 18-connected, and 26-connected.
+You can specify which class of neighbors should be used for erosion with the
@option{--erodengb} option, see below.
-@example
-$ astarithmetic float.fits int32 --output=int.fits
-@end example
+Erosion has the effect of shrinking the foreground pixels.
+To put it another way, it expands the holes.
+This is a founding principle in NoiseChisel: it exploits the fact that with very low thresholds, the holes in the very low surface brightness regions of an image will be smaller than those in regions that have no signal.
+Therefore by expanding those holes, we are able to separate the regions
harboring signal.
-It may happen that the whole input dataset is covered by signal, for example,
when working on parts of the Andromeda galaxy, or nearby globular clusters
(that cover the whole field of view).
-In such cases, segmentation is necessary over the complete dataset, not just
specific regions (detections).
-By default Segment will first use the undetected regions as a reference to
find the proper signal-to-noise ratio of ``true'' clumps (give a purity level
specified with @option{--snquant}).
-Therefore, in such scenarios you also need to manually give a ``true'' clump
signal-to-noise ratio with the @option{--clumpsnthresh} option to disable
looking into the undetected regions, see @ref{Segmentation options}.
-In such cases, is possible to make a detection map that only has the value
@code{1} for all pixels (for example, using @ref{Arithmetic}), but for
convenience, you can also use @option{--detection=all}.
+@item --erodengb=INT
+The type of neighborhood (structuring element) used in erosion, see
@option{--erode} for an explanation on erosion.
+If the input is a 2D image, only two integer values are acceptable: 4 or 8.
+For a 3D input datacube, the acceptable values are: 6, 18 and 26.
-@item --dhdu
-The HDU/extension containing the detection map given to @option{--detection}.
-Please see the description of @option{--hdu} in @ref{Input output options} for
the different ways you can identify a special extension.
+In 2D 4-connectivity, the neighbors of a pixel are defined as the four pixels
on the top, bottom, right and left of a pixel that share an edge with it.
+The 8-connected neighbors on the other hand include the 4-connected neighbors
along with the other 4 pixels that share a corner with this pixel.
+See Figure 6 (a) and (b) in Akhlaghi and Ichikawa (2015) for a demonstration.
+A similar argument applies to 3D datacubes.
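+
+For example (the values are only illustrative, not recommendations), a sketch of a stronger erosion with the weaker 4-connectivity, inspecting every step with the @option{--checkdetection} option described below:
+
+@example
+$ astnoisechisel image.fits --erode=2 --erodengb=4 \
+                 --checkdetection
+@end example
+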
-@item -k FITS
-@itemx --kernel=FITS
-The name of file containing kernel that will be used to convolve the input
image.
-The usage of this option is identical to NoiseChisel's @option{--kernel}
option (@ref{NoiseChisel input}).
-Please see the descriptions there for more.
-To disable convolution, you can give it a value of @option{none}.
+@item --noerodequant
+Pure erosion is going to carve off sharp and small objects completely out of
the detected regions.
+This option can be used to avoid missing such sharp and small objects (which
have significant pixels, but not over a large area).
+All pixels with a value larger than the significance level specified by this
option will not be eroded during the erosion step above.
+However, they will undergo the erosion and dilation of the opening step below.
-@item --khdu
-The HDU/extension containing the kernel used for convolution.
-For acceptable values, please see the description of @option{--hdu} in
@ref{Input output options}.
+Like the @option{--qthresh} option, the significance level is determined using
the quantile (a value between 0 and 1).
+Just as a reminder, in the normal distribution, @mymath{1\sigma},
@mymath{1.5\sigma}, and @mymath{2\sigma} are approximately on the 0.84, 0.93,
and 0.98 quantiles.
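+
+Following the approximation above, a sketch that protects pixels above roughly the @mymath{2\sigma} level from erosion:
+
+@example
+$ astnoisechisel image.fits --noerodequant=0.98
+@end example
+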
-@item --convolved=FITS
-The convolved image's file name to avoid internal convolution by Segment.
-The usage of this option is identical to NoiseChisel's @option{--convolved}
option.
-Please see @ref{NoiseChisel input} for a thorough discussion of the usefulness
and best practices of using this option.
+@item -p INT
+@itemx --opening=INT
+Depth of opening to be applied to the eroded binary image.
+Opening is a composite operation.
+When opening a binary image with a depth of @mymath{n}, @mymath{n} erosions
(explained in @option{--erode}) are followed by @mymath{n} dilations.
+Simply put, dilation is the inverse of erosion.
+When dilating an image, any background pixel that neighbors a foreground pixel is flipped (from 0 to 1) to become a foreground pixel.
+Dilation has the effect of fattening the foreground.
+Note that in NoiseChisel, the erosion which is part of opening is independent
of the initial erosion that is done on the thresholded image (explained in
@option{--erode}).
+The structuring element for the opening can be specified with the
@option{--openingngb} option.
+Opening has the effect of removing the thin foreground connections (mostly
noise) between separate foreground `islands' (detections) thereby completely
isolating them.
+Once opening is complete, we have @emph{initial} detections.
-If you want to use the same convolution kernel for detection (with
@ref{NoiseChisel}) and segmentation, with this option, you can use the same
convolved image (that is also available in NoiseChisel) and avoid two
convolutions.
-However, just be careful to use the input to NoiseChisel as the input to
Segment also, then use the @option{--sky} and @option{--std} to specify the Sky
and its standard deviation (from NoiseChisel's output).
-Recall that when NoiseChisel is not called with @option{--rawoutput}, the
first extension of NoiseChisel's output is the @emph{Sky-subtracted} input (see
@ref{NoiseChisel output}).
-So if you use the same convolved image that you fed to NoiseChisel, but use
NoiseChisel's output with Segment's @option{--convolved}, then the convolved
image will not be Sky subtracted.
+@item --openingngb=INT
+The structuring element used for opening, see @option{--erodengb} for more
information about a structuring element.
-@item --chdu
-The HDU/extension containing the convolved image (given to
@option{--convolved}).
-For acceptable values, please see the description of @option{--hdu} in
@ref{Input output options}.
+@item --skyfracnoblank
+Ignore blank pixels when estimating the fraction of undetected pixels for Sky
estimation.
+NoiseChisel only measures the Sky over the tiles that have a sufficiently
large fraction of undetected pixels (value given to @option{--minskyfrac}).
+By default this fraction is found by dividing the number of undetected pixels in a tile by the tile's area.
+But this default behavior ignores the possibility of blank pixels.
+In situations where blank/masked pixels are scattered across the image, and if their area is large enough, all the tiles can fail the @option{--minskyfrac} test, thus preventing NoiseChisel from proceeding.
+With this option, such scenarios can be fixed: the denominator of the fraction will be the number of non-blank elements in the tile, not the total tile area.
-@item -L INT[,INT]
-@itemx --largetilesize=INT[,INT]
-The size of the large tiles to use for identifying the clump S/N threshold
over the undetected regions.
-The usage of this option is identical to NoiseChisel's
@option{--largetilesize} option (@ref{NoiseChisel input}).
-Please see the descriptions there for more.
+@item -B FLT
+@itemx --minskyfrac=FLT
+Minimum fraction (value between 0 and 1) of Sky (undetected) areas in a tile.
+Only tiles with a fraction of undetected pixels (Sky) larger than this value
will be used to estimate the Sky value.
+NoiseChisel uses this option value twice to estimate the Sky value: after
initial detections and in the end when false detections have been removed.
-The undetected regions can be a significant fraction of the dataset and
finding clumps requires sorting of the desired regions, which can be slow.
-To speed up the processing, Segment finds clumps in the undetected regions
over separate large tiles.
-This allows it to have to sort a much smaller set of pixels and also to treat
them independently and in parallel.
-Both these issues greatly speed it up.
-Just be sure to not decrease the large tile sizes too much (less than 100
pixels in each dimension).
-It is important for them to be much larger than the clumps.
+Because of the PSF and their intrinsic amorphous properties, astronomical objects (except cosmic rays) never have a clear cutoff and commonly sink into the noise very slowly, even below the very low thresholds used by NoiseChisel.
+So when a large fraction of the area of one mesh is covered by detections, it
is very plausible that their faint wings are present in the undetected regions
(hence causing a bias in any measurement).
+To get an accurate measurement of the above parameters over the tessellation,
tiles that harbor too many detected regions should be excluded.
+The tiles that were used can be inspected with the respective @option{--check} option of the given step.
-@end table
+@item --checkdetsky
+Check the initial approximation of the sky value and its standard deviation in
a FITS file ending with @file{_detsky.fits}.
+With this option, NoiseChisel will abort as soon as the Sky value used for defining pseudo-detections has been estimated.
+This allows you to inspect the steps leading to this initial Sky estimate; this behavior can be disabled with @option{--continueaftercheck}.
+By default the output will have the same pixel size as the input, but with the
@option{--oneelempertile} option, only one pixel will be used for each tile
(see @ref{Processing options}).
+@item -s FLT,FLT
+@itemx --sigmaclip=FLT,FLT
+The @mymath{\sigma}-clipping parameters for measuring the initial and final
Sky values from the undetected pixels, see @ref{Sigma clipping}.
-@node Segmentation options, Segment output, Segment input, Invoking astsegment
-@subsubsection Segmentation options
+This option takes two values which are separated by a comma (@key{,}).
+Each value can either be written as a single number or as a fraction of two
numbers (for example, @code{3,1/10}).
+The first value to this option is the multiple of @mymath{\sigma} that will be
clipped (@mymath{\alpha} in that section).
+The second value is the exit criteria.
+If it is less than 1, it is interpreted as a tolerance; if it is larger than 1, it is assumed to be a fixed number of iterations.
+Hence, in the latter case, the value must be an integer.
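+
+For example (with illustrative values), the two types of exit criteria look like this:
+
+@example
+## 3 sigma clipping, stopping at a tolerance of 0.1 (<1).
+$ astnoisechisel image.fits --sigmaclip=3,0.1
+
+## 3 sigma clipping, stopping after exactly 5 iterations (>1).
+$ astnoisechisel image.fits --sigmaclip=3,5
+@end example
+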
-The options below can be used to configure every step of the segmentation
process in the Segment program.
-For a more complete explanation (with figures to demonstrate each step),
please see Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and
Ichikawa [2015]}, and also @ref{Segment}.
-By default, Segment will follow the procedure described in the paper to find
the S/N threshold based on the noise properties.
-This can be disabled by directly giving a trustable signal-to-noise ratio to
the @option{--clumpsnthresh} option.
+@item -R FLT
+@itemx --dthresh=FLT
+The detection threshold: a multiple of the initial Sky standard deviation
added with the initial Sky approximation (which you can inspect with
@option{--checkdetsky}).
+This flux threshold is applied to the initially undetected regions on the
unconvolved image.
+The background pixels that are completely engulfed in a 4-connected foreground
region are converted to background (holes are filled) and one opening (depth of
1) is applied over both the initially detected and undetected regions.
+The Signal to noise ratio of the resulting `pseudo-detections' is used to identify true vs. false detections.
+See Section 3.1.5 and Figure 7 in Akhlaghi and Ichikawa (2015) for a very
complete explanation.
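+
+A sketch for experimenting with this threshold (the value is only illustrative) while inspecting the resulting pseudo-detections:
+
+@example
+$ astnoisechisel image.fits --dthresh=0.1 --checkdetection
+@end example
+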
-Recall that you can always see the full list of Gnuastro's options with the
@option{--help} (see @ref{Getting help}), or @option{--printparams} (or
@option{-P}) to see their values (see @ref{Operating mode options}).
+@item --dopening=INT
+The number of openings to do after applying @option{--dthresh}.
-@table @option
+@item --dopeningngb=INT
+The connectivity used in the opening of @option{--dopening}.
+In a 2D image this must be either 4 or 8.
+The stronger the connectivity, the more small regions will be discarded.
-@item -B FLT
-@itemx --minskyfrac=FLT
-Minimum fraction (value between 0 and 1) of Sky (undetected) areas in a large
tile.
-Only (large) tiles with a fraction of undetected pixels (Sky) greater than
this value will be used for finding clumps.
-The clumps found in the undetected areas will be used to estimate a S/N
threshold for true clumps.
-Therefore this is an important option (to decrease) in crowded fields.
-Operationally, this is almost identical to NoiseChisel's @option{--minskyfrac}
option (@ref{Detection options}).
-Please see the descriptions there for more.
+@item --holengb=INT
+The connectivity (defined by the number of neighbors) to fill holes after
applying @option{--dthresh} (above) to find pseudo-detections.
+For example, in a 2D image it must be 4 (the neighbors that are most strongly
connected) or 8 (all neighbors).
+The stronger the connectivity, the more strongly the hole will be enclosed.
+So setting a value of 8 in a 2D image means that the walls of the hole are 4-connected.
+If standard (near Sky level) values are given to @option{--dthresh}, setting @option{--holengb=4} might fill the complete dataset and thus not create enough pseudo-detections.
-@item --minima
-Build the clumps based on the local minima, not maxima.
-By default, clumps are built starting from local maxima (see Figure 8 of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}).
-Therefore, this option can be useful when you are searching for true local
minima (for example, absorption features).
+@item --pseudoconcomp=INT
+The connectivity (defined by the number of neighbors) to find individual
pseudo-detections.
+If it is a weaker connectivity (4 in a 2D image), then pseudo-detections that
are connected on the corners will be treated as separate.
@item -m INT
@itemx --snminarea=INT
-The minimum area which a clump in the undetected regions should have in order
to be considered in the clump Signal to noise ratio measurement.
-If this size is set to a small value, the Signal to noise ratio of false
clumps will not be accurately found.
-It is recommended that this value be larger than the value to NoiseChisel's
@option{--snminarea}.
-Because the clumps are found on the convolved (smoothed) image while the
pseudo-detections are found on the input image.
-You can use @option{--checksn} and @option{--checksegmentation} to see if your
chosen value is reasonable or not.
+The minimum area to calculate the Signal to noise ratio on the
pseudo-detections of both the initially detected and undetected regions.
+When the area in a pseudo-detection is too small, the Signal to noise ratio
measurements will not be accurate and their distribution will be heavily skewed
to the positive.
+So it is best to ignore any pseudo-detection that is smaller than this area.
+Use @option{--detsnhistnbins} to check if this value is reasonable or not.
@item --checksn
-Save the S/N values of the clumps over the sky and detected regions into
separate tables.
-If @option{--tableformat} is a FITS format, each table will be written into a
separate extension of one file suffixed with @file{_clumpsn.fits}.
-If it is plain text, a separate file will be made for each table (ending in
@file{_clumpsn_sky.txt} and @file{_clumpsn_det.txt}).
+Save the S/N values of the pseudo-detections (and possibly grown detections if
@option{--cleangrowndet} is called) into separate tables.
+If @option{--tableformat} is a FITS table, each table will be written into a
separate extension of one file suffixed with @file{_detsn.fits}.
+If it is plain text, a separate file will be made for each table (ending in
@file{_detsn_sky.txt}, @file{_detsn_det.txt} and @file{_detsn_grown.txt}).
For more on @option{--tableformat} see @ref{Input output options}.
-You can use these tables to inspect the S/N values and their distribution (in
combination with the @option{--checksegmentation} option to see where the
clumps are).
-You can use Gnuastro's @ref{Statistics} to make a histogram of the
distribution (ready for plotting in a text file, or a crude ASCII-art
demonstration on the command-line).
-
-With this option, Segment will abort as soon as the two tables are created.
-This allows you to inspect the steps leading to the final S/N quantile
threshold, this behavior can be disabled with @option{--continueaftercheck}.
+You can use these to inspect the S/N values and their distribution (in combination with the @option{--checkdetection} option to see where the pseudo-detections are).
+You can use Gnuastro's @ref{Statistics} to make a histogram of the distribution, or to do any other analysis that helps you better understand it.
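+
+As a minimal sketch (the check file's name is built from the input's base name; the tables' HDU and column names may differ between versions, so list them first with Gnuastro's Fits program):
+
+@example
+$ astnoisechisel image.fits --checksn
+$ astfits image_detsn.fits        # List the table HDUs.
+@end example
+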
@item --minnumfalse=INT
-The minimum number of clumps over undetected (Sky) regions to identify the
requested Signal-to-Noise ratio threshold.
-Operationally, this is almost identical to NoiseChisel's
@option{--minnumfalse} option (@ref{Detection options}).
-Please see the descriptions there for more.
+The minimum number of `pseudo-detections' over the undetected regions to
identify a Signal-to-Noise ratio threshold.
+The Signal to noise ratio (S/N) of false pseudo-detections in each tile is
found using the quantile of the S/N distribution of the pseudo-detections over
the undetected pixels in each mesh.
+If the number of S/N measurements is not large enough, the quantile will not
be accurate (can have large scatter).
+For example, if you set @option{--snquant=0.99} (or the top 1 percent), then
it is best to have at least 100 S/N measurements.
@item -c FLT
@itemx --snquant=FLT
-The quantile of the signal-to-noise ratio distribution of clumps in undetected
regions, used to define true clumps.
-After identifying all the usable clumps in the undetected regions of the
dataset, the given quantile of their signal-to-noise ratios is used to define
the signal-to-noise ratio of a ``true'' clump.
-Effectively, this can be seen as an inverse p-value measure.
-See Figure 9 and Section 3.2.1 of @url{https://arxiv.org/abs/1505.01664,
Akhlaghi and Ichikawa [2015]} for a complete explanation.
-The full distribution of clump signal-to-noise ratios over the undetected
areas can be saved into a table with @option{--checksn} option and visually
inspected with @option{--checksegmentation}.
+The quantile of the Signal to noise ratio distribution of the
pseudo-detections in each mesh to use for filling the large mesh grid.
+Note that this is only calculated for the large mesh grids that satisfy the minimum fraction of undetected pixels (value of @option{--minskyfrac}) and minimum number of pseudo-detections (value of @option{--minnumfalse}).
+
+@item --snthresh=FLT
+Manually set the signal-to-noise ratio of true pseudo-detections.
+With this option, NoiseChisel will not attempt to find pseudo-detections over
the noisy regions of the dataset, but will directly go onto applying the
manually input value.
+
+This option is useful in crowded images where there is no blank sky to find
the sky pseudo-detections.
+You can get this value from a similarly reduced dataset (from another region of the sky with more undetected space).
+
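+For example (the values are only illustrative): use the top one percent of the sky pseudo-detection S/N distribution, or bypass that measurement entirely in a crowded field with a previously derived value:
+
+@example
+$ astnoisechisel image.fits   --snquant=0.99
+$ astnoisechisel crowded.fits --snthresh=5
+@end example
+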
+@item -d FLT
+@itemx --detgrowquant=FLT
+Quantile limit to ``grow'' the final detections.
+As discussed in the previous options, after applying the initial quantile
threshold, layers of pixels are carved off the objects to identify true signal.
+With this step you can return those low surface brightness layers that were
carved off back to the detections.
+To disable growth, set the value of this option to @code{1}.
+
+The process is as follows: after the true detections are found, all the
non-detected pixels above this quantile will be put in a list and used to
``grow'' the true detections (seeds of the growth).
+Like all quantile thresholds, this threshold is defined and applied to the
convolved dataset.
+Afterwards, the dataset is dilated once (with minimum connectivity) to connect
very thin regions on the boundary: imagine building a dam at the point rivers
spill into an open sea/ocean.
+Finally, all holes are filled.
+In the geography metaphor, holes can be seen as the closed (by the dams)
rivers and lakes, so this process is like turning the water in all such rivers
and lakes into soil.
+See @option{--detgrowmaxholesize} for configuring the hole filling.
+
+Note that since the growth occurs on all neighbors of a data element, the quantile for 3D detection must be much larger than that of 2D detection.
+Recall that in 2D each element has 8 neighbors while in 3D there are 26 neighbors.
+
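+As a sketch of configuring the growth (the values are only illustrative; see also @option{--detgrowmaxholesize} below):
+
+@example
+## Grow detections down to the 0.7 quantile and fill holes
+## smaller than 10000 pixels.
+$ astnoisechisel image.fits --detgrowquant=0.7 \
+                 --detgrowmaxholesize=10000
+@end example
+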
+@item --detgrowmaxholesize=INT
+The maximum hole size to fill during the final expansion of the true
detections as described in @option{--detgrowquant}.
+This is necessary when the input contains many smaller objects and can be used
to avoid marking blank sky regions as detections.
-@item -v
-@itemx --keepmaxnearriver
-Keep a clump whose maximum (minimum if @option{--minima} is called) flux is
8-connected to a river pixel.
-By default such clumps over detections are considered to be noise and are
removed irrespective of their significance measure (see
@url{https://arxiv.org/abs/1909.11230,Akhlaghi 2019}).
-Over large profiles, that sink into the noise very slowly, noise can cause
part of the profile (which was flat without noise) to become a very large and
with a very high Signal to noise ratio.
-In such cases, the pixel with the maximum flux in the clump will be
immediately touching a river pixel.
+For example, multiple galaxies can be positioned such that they surround an
empty region of sky.
+If all the holes are filled, the Sky region in between them will be taken as a
detection which is not desired.
+To avoid such cases, the integer given to this option must be smaller than the
hole between such objects.
+However, we should caution that unless the ``hole'' is very large, the combined faint wings of the galaxies might actually be present in between them, so be very careful not to fill such holes.
-@item -s FLT
-@itemx --clumpsnthresh=FLT
-The signal-to-noise threshold for true clumps.
-If this option is given, then the segmentation options above will be ignored
and the given value will be directly used to identify true clumps over the
detections.
-This can be useful if you have a large dataset with similar noise properties.
-You can find a robust signal-to-noise ratio based on a (sufficiently large)
smaller portion of the dataset.
-Afterwards, with this option, you can speed up the processing on the whole
dataset.
-Other scenarios where this option may be useful is when, the image might not
contain enough/any Sky regions.
+On the other hand, if you have a very large (and extended) galaxy, the diffuse
wings of the galaxy may create very large holes over the detections.
+In such cases, a large enough value to this option will cause all such holes
to be detected as part of the large galaxy and thus help in detecting it to
extremely low surface brightness limits.
+Therefore, especially when large and extended objects are present in the
image, it is recommended to give this option (very) large values.
+For one real-world example, see @ref{Detecting large extended targets}.
-@item -G FLT
-@itemx --gthresh=FLT
-Threshold (multiple of the sky standard deviation added with the sky) to stop
growing true clumps.
-Once true clumps are found, they are set as the basis to segment the detected
region.
-They are grown until the threshold specified by this option.
+@item --cleangrowndet
+After dilation, if the signal-to-noise ratio of a detection is less than the
derived pseudo-detection S/N limit, that detection will be discarded.
+In an ideal/clean noise, a true detection's S/N should be larger than its
constituent pseudo-detections because its area is larger and it also covers
more signal.
+However, on false detections (especially at lower @option{--snquant} values), the increase in size can cause a decrease in S/N below that threshold.
-@item -y INT
-@itemx --minriverlength=INT
-The minimum length of a river between two grown clumps for it to be considered
in signal-to-noise ratio estimations.
-Similar to @option{--snminarea}, if the length of the river is too short, the
signal-to-noise ratio can be noisy and unreliable.
-Any existing rivers shorter than this length will be considered as
non-existent, independent of their Signal to noise ratio.
-The clumps are grown on the input image, therefore this value can be smaller
than the value given to @option{--snminarea}.
-Recall that the clumps were defined on the convolved image so
@option{--snminarea} should be larger.
+This will improve purity without changing completeness (a true detection will not be discarded): a true detection has flux in its vicinity, so dilation will catch more of that flux and increase the S/N.
+Hence, for a true detection, the final S/N cannot be less than that of its pseudo-detections.
-@item -O FLT
-@itemx --objbordersn=FLT
-The maximum Signal to noise ratio of the rivers between two grown clumps in
order to consider them as separate `objects'.
-If the Signal to noise ratio of the river between two grown clumps is larger
than this value, they are defined to be part of one `object'.
-Note that the physical reality of these `objects' can never be established
with one image, or even multiple images from one broad-band filter.
-Any method we devise to define `object's over a detected region is ultimately
subjective.
+However, in many real images bad processing creates artifacts that cannot be
accurately removed by the Sky subtraction.
+In such cases, this option will decrease the completeness (will artificially
discard true detections).
+So this feature is not activated by default and should be explicitly called only when you know the noise is clean.
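+
+For example (a sketch; @file{image.fits} is a hypothetical input with
+clean noise):
+
+@example
+$ astnoisechisel image.fits --cleangrowndet
+@end example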
-Two very distant galaxies or satellites in one halo might lie in the same line
of sight and be detected as clumps on one detection.
-On the other hand, the connection (through a spiral arm or tidal tail for
example) between two parts of one galaxy might have such a low surface
brightness that they are broken up into multiple detections or objects.
-In fact if you have noticed, exactly for this purpose, this is the only Signal
to noise ratio that the user gives into NoiseChisel.
-The `true' detections and clumps can be objectively identified from the noise
characteristics of the image, so you do not have to give any hand input Signal
to noise ratio.
-@item --checksegmentation
-A file with the suffix @file{_seg.fits} will be created.
-This file keeps all the relevant steps in finding true clumps and segmenting
the detections into multiple objects in various extensions.
-Having read the paper or the steps above, examining this file can be an excellent guide in choosing the best set of parameters.
-Note that calling this function will significantly slow NoiseChisel.
-In verbose mode (without the @option{--quiet} option, see @ref{Operating mode
options}) the important steps (along with their extension names) will also be
reported.
+@item --checkdetection
+Every step of the detection process will be added as an extension to a file
with the suffix @file{_det.fits}.
+Going through each would just be a repeat of the explanations above and also
of those in Akhlaghi and Ichikawa (2015).
+The extension label should be sufficient to recognize which step you are
observing.
+Viewing all the steps can be the best guide in choosing the best set of
parameters.
+With this option, NoiseChisel will abort as soon as a snapshot of the whole detection process is saved.
+This behavior can be disabled with @option{--continueaftercheck}.
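+
+For example (a sketch; @file{image.fits} is a hypothetical input):
+
+@example
+$ astnoisechisel image.fits --checkdetection
+@end example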
-With this option, NoiseChisel will abort as soon as the two tables are created.
+@item --checksky
+Check the derivation of the final Sky and its standard deviation values on the mesh grid.
+With this option, NoiseChisel will abort as soon as the Sky value is estimated over the image (on each tile).
This behavior can be disabled with @option{--continueaftercheck}.
+By default the output will have the same pixel size as the input, but with the
@option{--oneelempertile} option, only one pixel will be used for each tile
(see @ref{Processing options}).
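+
+For example, a sketch with a hypothetical input, keeping one value per
+tile in the check file:
+
+@example
+$ astnoisechisel image.fits --checksky --oneelempertile
+@end example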
@end table
-@node Segment output, , Segmentation options, Invoking astsegment
-@subsubsection Segment output
-The main output of Segment are two label datasets (with integer types,
separating the dataset's elements into different classes).
-They have HDU/extension names of @code{CLUMPS} and @code{OBJECTS}.
-Similar to all Gnuastro's FITS outputs, the zero-th extension/HDU of the main output file only contains header keywords and no image or table.
-It contains the Segment input files and parameters (option names and values)
as FITS keywords.
-Note that if an option name is longer than 8 characters, the keyword name is
the second word.
-The first word is @code{HIERARCH}.
-Also note that according to the FITS standard, the keyword names must be in
capital letters, therefore, if you want to use Grep to inspect these keywords,
use the @option{-i} option, like the example below.
-@example
-$ astfits image_segmented.fits -h0 | grep -i snquant
-@end example
-@cindex DS9
-@cindex SAO DS9
-By default, besides the @code{CLUMPS} and @code{OBJECTS} extensions, Segment's
output will also contain the (technically redundant) input dataset and the sky
standard deviation dataset (if it was not a constant number).
-This can help in visually inspecting the result when viewing the images as a
``Multi-extension data cube'' in SAO DS9 for example, (see @ref{Viewing FITS
file contents with DS9 or TOPCAT}).
-You can simply flip through the extensions and see the same region of the
image and its corresponding clumps/object labels.
-It also makes it easy to feed the output (as one file) into MakeCatalog when you intend to make a catalog afterwards (see @ref{MakeCatalog}).
-To remove these redundant extensions from the output (for example, when
designing a pipeline), you can use @option{--rawoutput}.
+@node NoiseChisel output, , Detection options, Invoking astnoisechisel
+@subsubsection NoiseChisel output
-The @code{OBJECTS} and @code{CLUMPS} extensions can be used as input into
@ref{MakeCatalog} to generate a catalog for higher-level analysis.
-If you want to treat each clump separately, you can give a very large value
(or even a NaN, which will always fail) to the @option{--gthresh} option (for
example, @code{--gthresh=1e10} or @code{--gthresh=nan}), see @ref{Segmentation
options}.
+NoiseChisel's output is a multi-extension FITS file.
+The main extension/dataset is a (binary) detection map.
+It has the same size as the input but with only two possible values for all
pixels: 0 (for pixels identified as noise) and 1 (for those identified as
signal/detections).
+The detection map is followed by a Sky and Sky standard deviation dataset
(which are calculated from the binary image).
+By default (when @option{--rawoutput} is not called), NoiseChisel will also subtract the Sky value from the input and save the Sky-subtracted input as the first extension with data in the output.
+The zero-th extension (which contains no data) stores NoiseChisel's configuration as FITS keywords, see @ref{Output FITS files}.
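+
+For example, assuming the output is called @file{nc.fits} (a
+hypothetical name), you can list its extensions with Gnuastro's Fits
+program:
+
+@example
+$ astfits nc.fits
+@end example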
-For a complete definition of clumps and objects, please see Section 3.2 of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]} and
@ref{Segmentation options}.
-The clumps are ``true'' local maxima (minima if @option{--minima} is called)
and their surrounding pixels until a local minimum/maximum (caused by noise
fluctuations, or another ``true'' clump).
-Therefore it may happen that some of the input detections are not covered by
clumps at all (very diffuse objects without any strong peak), while some
objects may contain many clumps.
-Even in those that have clumps, there will be regions that are too diffuse.
-The diffuse regions (within the input detected regions) are given a negative
label (-1) to help you separate them from the undetected regions (with a value
of zero).
+The name of the output file can be set by giving a value to @option{--output}
(this is a common option between all programs and is therefore discussed in
@ref{Input output options}).
+If @option{--output} is not used, the input name will be suffixed with
@file{_detected.fits} and used as output, see @ref{Automatic output}.
+If any of the options starting with @option{--check*} are given, NoiseChisel
will not complete and will abort as soon as the respective check images are
created.
+For more information on the different check images, see the description for
the @option{--check*} options in @ref{Detection options} (this can be disabled
with @option{--continueaftercheck}).
-Each clump is labeled with respect to its host object.
-Therefore, if an object has three clumps for example, the clumps within it
have labels 1, 2 and 3.
-As a result, if an initial detected region has multiple objects, each with a
single clump, all the clumps will have a label of 1.
-The total number of clumps in the dataset is stored in the @code{NCLUMPS}
keyword of the @code{CLUMPS} extension and printed in the verbose output of
Segment (when @option{--quiet} is not called).
+The last two extensions of the output are the Sky and its Standard deviation,
see @ref{Sky value} for a complete explanation.
+They are calculated on the tile grid that you defined for NoiseChisel.
+By default these datasets will have the same size as the input, but with all
the pixels in one tile given one value.
+To be more space-efficient (keep only one pixel per tile), you can use the
@option{--oneelempertile} option, see @ref{Tessellation}.
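+
+For example, to keep the Sky standard deviation in a separate file for
+later use (a sketch; @file{nc.fits} is hypothetical and the extension
+name is assumed to be NoiseChisel's default of @code{SKY_STD}):
+
+@example
+$ astfits nc.fits --copy=SKY_STD --output=std.fits
+@end example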
-The @code{OBJECTS} extension of the output will give a positive counter/label
to every detected pixel in the input.
-As described in Akhlaghi and Ichikawa [2015], the true clumps are grown until
a certain threshold.
-If the grown clumps touch other clumps and the connection is strong enough,
they are considered part of the same @emph{object}.
-Once objects (grown clumps) are identified, they are grown to cover the whole
detected area.
+@cindex GNOME
+To inspect any of NoiseChisel's output files, assuming you use SAO DS9, you
can configure your Graphic User Interface (GUI) to open NoiseChisel's output as
a multi-extension data cube.
+This will allow you to flip through the different extensions and visually
inspect the results.
+This process has been described for the GNOME GUI (most common GUI in
GNU/Linux operating systems) in @ref{Viewing FITS file contents with DS9 or
TOPCAT}.
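+
+For example, with Gnuastro's @file{astscript-fits-view} installed
+script (assuming it is available on your system and the output is
+called @file{nc.fits}), the whole file can be opened in DS9 directly:
+
+@example
+$ astscript-fits-view nc.fits
+@end example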
-The options to configure the output of Segment are listed below:
+NoiseChisel's output configuration options are described in detail below.
@table @option
@item --continueaftercheck
-Do Not abort Segment after producing the check image(s).
-The usage of this option is identical to NoiseChisel's
@option{--continueaftercheck} option (@ref{NoiseChisel input}).
-Please see the descriptions there for more.
+Continue NoiseChisel after any of the options starting with @option{--check} (see @ref{Detection options}).
+NoiseChisel involves many steps and as a result, there are many checks,
allowing you to inspect the status of the processing.
+The results of each step affect the next steps of processing.
+Therefore, when you only want to check the status of the processing at one step, the time spent completing the rest of NoiseChisel's run is just wasted/distracting.
-@item --noobjects
-Abort Segment after finding true clumps and do not continue with finding objects.
-Therefore, no @code{OBJECTS} extension will be present in the output.
-Each true clump in @code{CLUMPS} will get a unique label, but diffuse regions
will still have a negative value.
+To encourage easier experimentation with the option values, when you use any
of the NoiseChisel options that start with @option{--check}, NoiseChisel will
abort once its desired extensions have been written.
+With @option{--continueaftercheck} option, you can disable this behavior and
ask NoiseChisel to continue with the rest of the processing, even after the
requested check files are complete.
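+
+For example, to save the detection check file but still produce the
+final output in the same run (a sketch with a hypothetical input):
+
+@example
+$ astnoisechisel image.fits --checkdetection --continueaftercheck
+@end example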
-To make a catalog of the clumps, the input detection map (where all the labels
are one) can be fed into @ref{MakeCatalog} along with the input detection map
to Segment (that only had a value of @code{1} for all detected pixels) with
@option{--clumpscat}.
-In this way, MakeCatalog will assume all the clumps belong to a single
``object''.
+@item --ignoreblankintiles
+Do not set the input's blank pixels to blank in the tiled outputs (for example, the Sky and Sky standard deviation extensions of the output).
+This is only applicable when the tiled output has the same size as the input,
in other words, when @option{--oneelempertile} is not called.
-@item --grownclumps
-In the output @code{CLUMPS} extension, store the grown clumps.
-If a detected region contains no clumps or only one clump, then it will be
fully given a label of @code{1} (no negative valued pixels).
+By default, blank values in the input (commonly on the edges which are outside
the survey/field area) will be set to blank in the tiled outputs also.
+But in other scenarios this default behavior is not desired: for example, if you have masked something in the input but want the tiled output to also cover that region.
+
+@item -l
+@itemx --label
+Run a connected-components algorithm on the finally detected pixels to
identify which pixels are connected to which.
+By default the main output is a binary dataset with only two values: 0 (for
noise) and 1 (for signal/detections).
+See @ref{NoiseChisel output} for more.
+
+The purpose of NoiseChisel is to detect targets that are extended and diffuse,
with outer parts that sink into the noise very gradually (galaxies and stars
for example).
+Since NoiseChisel digs down to extremely low surface brightness values, many
such targets will commonly be detected together as a single large body of
connected pixels.
+
+To properly separate connected objects, sophisticated segmentation methods are
commonly necessary on NoiseChisel's output.
+Gnuastro has the dedicated @ref{Segment} program for this job.
+Since input images are commonly large and can take a significant volume, the extra volume necessary to store the labels of the connected components in the detection map (which will be created with this @option{--label} option, in 32-bit signed integer type) can be a major waste of space.
+Since the default output is just a binary dataset, an 8-bit unsigned dataset
is enough.
+
+The binary output will also encourage users to segment the result separately
prior to doing higher-level analysis.
+As an alternative to @option{--label}, if you have the binary detection image,
you can use the @code{connected-components} operator in Gnuastro's Arithmetic
program to identify regions that are connected with each other.
+For example, with this command (assuming NoiseChisel's output is called
@file{nc.fits}):
+
+@example
+$ astarithmetic nc.fits 2 connected-components -hDETECTIONS
+@end example
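+
+To produce the labeled map directly in the same run, just call this
+option (a sketch with a hypothetical input):
+
+@example
+$ astnoisechisel image.fits --label
+@end example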
@item --rawoutput
-Only write the @code{CLUMPS} and @code{OBJECTS} datasets in the output file.
-Without this option (by default), the first and last extensions of the output will be the Sky-subtracted input dataset and the Sky standard deviation dataset (if it was not a number).
-When the datasets are small, these redundant extensions can make it convenient
to inspect the results visually or feed the output to @ref{MakeCatalog} for
measurements.
-Ultimately both the input and Sky standard deviation datasets are redundant
(you had them before running Segment).
-When the inputs are large/numerous, these extra datasets can be a burden.
+Do not include the Sky-subtracted input image as the first extension of the output.
+By default, the Sky-subtracted input is put in the first extension of the
output.
+The next extensions are NoiseChisel's main outputs described above.
+
+The extra Sky-subtracted input can be convenient in checking NoiseChisel's
output and comparing the detection map with the input: visually see if
everything you expected is detected (reasonable completeness) and that you do
not have too many false detections (reasonable purity).
+This visual inspection is simplified if you use SAO DS9 to view NoiseChisel's
output as a multi-extension data-cube, see @ref{Viewing FITS file contents with
DS9 or TOPCAT}.
+
+When you are satisfied with your NoiseChisel configuration (so you do not need to check on every run), or you want to archive/transfer the outputs, or the datasets become large, or you are running NoiseChisel as part of a pipeline, this Sky-subtracted input image can be a significant burden (taking up a large volume).
+The fact that the input is also noisy makes it hard to compress efficiently.
+
+In such cases, this @option{--rawoutput} option can be used to avoid the extra Sky-subtracted input in the output.
+It is always possible to easily produce the Sky-subtracted dataset from the
input (assuming it is in extension @code{1} of @file{in.fits}) and the
@code{SKY} extension of NoiseChisel's output (let's call it @file{nc.fits})
with a command like below (assuming NoiseChisel was not run with
@option{--oneelempertile}, see @ref{Tessellation}):
+
+@example
+$ astarithmetic in.fits nc.fits - -h1 -hSKY
+@end example
@end table
@cartouche
@noindent
@cindex Compression
-@strong{Save space:} with the @option{--rawoutput}, Segment's output will only
be two labeled datasets (only containing integers).
-Since they have no noise, such datasets can be compressed very effectively
(without any loss of data) with exceptionally high compression ratios.
-You can use the following command to compress it with the best ratio:
+@strong{Save space:} with the @option{--rawoutput} and
@option{--oneelempertile}, NoiseChisel's output will only be one binary
detection map and two much smaller arrays with one value per tile.
+Since none of these have noise, they can be compressed very effectively (without any loss of data) with exceptionally high compression ratios.
+This makes it easy to archive, or transfer, NoiseChisel's output even on huge
datasets.
+To compress it with the most efficient method (take up less volume), run the
following command:
@cindex GNU Gzip
@example
-$ gzip --best segment_output.fits
+$ gzip --best noisechisel_output.fits
@end example
@noindent
-The resulting @file{.fits.gz} file can then be fed into any of Gnuastro's
programs directly, without having to decompress it separately (it will just
take them a little longer, because they have to decompress it internally before
use).
+The resulting @file{.fits.gz} file can then be fed into any of Gnuastro's
programs directly, or viewed in viewers like SAO DS9, without having to
decompress it separately (they will just take a little longer, because they
have to internally decompress it before starting).
+See @ref{NoiseChisel optimization for storage} for an example on a real
dataset.
@end cartouche
-When the input is a 2D image, to inspect NoiseChisel's output you can
configure SAO DS9 in your Graphic User Interface (GUI) to open NoiseChisel's
output as a multi-extension data cube.
-This will allow you to flip through the different extensions and visually
inspect the results.
-This process has been described for the GNOME GUI (most common GUI in
GNU/Linux operating systems) in @ref{Viewing FITS file contents with DS9 or
TOPCAT}.
@@ -26040,2450 +26096,2489 @@ This process has been described for the GNOME GUI
(most common GUI in GNU/Linux
-@node MakeCatalog, Match, Segment, Data analysis
-@section MakeCatalog
+@node Segment, MakeCatalog, NoiseChisel, Data analysis
+@section Segment
+
+Once signal is separated from noise (for example, with @ref{NoiseChisel}), you
have a binary dataset: each pixel is either signal (1) or noise (0).
+Signal (for example, every galaxy in your image) has been ``detected'', but
all detections have a label of 1.
+Therefore while we know which pixels contain signal, we still cannot find out
how many galaxies they contain or which detected pixels correspond to which
galaxy.
+At the lowest (most generic) level, detection is a kind of segmentation
(segmenting the whole dataset into signal and noise, see @ref{NoiseChisel}).
+Here, we will define segmentation only on signal: to separate sub-structure
within the detections.
+
+@cindex Connected component labeling
+If the targets are clearly separated (their detected regions do not touch), a simple connected components@footnote{@url{https://en.wikipedia.org/wiki/Connected-component_labeling}} algorithm (very basic segmentation) is enough to give each connected region its own separate label.
+This is such a basic and simple form of segmentation that Gnuastro's
Arithmetic program has an operator for it: see @code{connected-components} in
@ref{Arithmetic operators}.
+Assuming the binary dataset is called @file{binary.fits}, you can use it with
a command like this:
+
+@example
+$ astarithmetic binary.fits 2 connected-components
+@end example
+
+@noindent
+You can even do a very basic detection (a threshold, say at value
+@code{100}) @emph{and} segmentation in Arithmetic with a single command
+like below:
+
+@example
+$ astarithmetic in.fits 100 gt 2 connected-components
+@end example
+
+However, in most astronomical situations our targets are neither nicely separated, nor do they have a sharp boundary/edge (for a threshold to suffice): they touch (for example, merging galaxies), or are simply in the same line-of-sight (which is much more common).
+This causes their images to overlap.
+
+In particular, when you do your detection with NoiseChisel, you will detect
signal to very low surface brightness limits: deep into the faint wings of
galaxies or bright stars (which can extend very far and irregularly from their
center).
+Therefore, it often happens that several galaxies are detected as one large
detection.
+Since they are touching, a simple connected components algorithm will not
suffice.
+It is therefore necessary to do a more sophisticated segmentation and break up
the detected pixels (even those that are touching) into multiple target objects
as accurately as possible.
+
+Segment will use a detection map and its corresponding dataset to find
sub-structure over the detected areas and use them for its segmentation.
+Until Gnuastro version 0.6 (released in 2018), Segment was part of
@ref{NoiseChisel}.
+Therefore, similar to NoiseChisel, the best place to start reading about
Segment and understanding what it does (with many illustrative figures) is
Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
[2015]}, and continue with @url{https://arxiv.org/abs/1909.11230, Akhlaghi
[2019]}.
+
+@cindex river
+@cindex Watershed algorithm
+As a summary, Segment first finds true @emph{clump}s over the detections.
+Clumps are associated with local maxima/minima@footnote{By default the maximum is used as the first clump pixel; to define clumps based on local minima, use the @option{--minima} option.} and extend over the neighboring pixels until they reach a local minimum/maximum (@emph{river}/@emph{watershed}).
+By default, Segment will use the distribution of clump signal-to-noise ratios
over the undetected regions as reference to find ``true'' clumps over the
detections.
+Using the undetected regions can be disabled by directly giving a
signal-to-noise ratio to @option{--clumpsnthresh}.
+
+The true clumps are then grown to a certain threshold over the detections.
+Based on the strength of the connections (rivers/watersheds) between the grown
clumps, they are considered parts of one @emph{object} or as separate
@emph{object}s.
+See Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and
Ichikawa [2015]} for more.
+Segment's main outputs are thus two labeled datasets: 1) clumps, and 2) objects.
+See @ref{Segment output} for more.
+
+To start learning about Segment, especially in relation to detection
(@ref{NoiseChisel}) and measurement (@ref{MakeCatalog}), the recommended
references are @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
[2015]}, @url{https://arxiv.org/abs/1611.06387, Akhlaghi [2016]} and
@url{https://arxiv.org/abs/1909.11230, Akhlaghi [2019]}.
+If you have used Segment within your research, please run it with
@option{--cite} to list the papers you should cite and how to acknowledge its
funding sources.
+
+Those papers cannot be updated any more but the software will evolve.
+For example, Segment became a separate program (from NoiseChisel) in 2018
(after those papers were published).
+Therefore this book is the definitive reference.
+@c To help in the transition from those papers to the software you are using,
see @ref{Segment changes after publication}.
+Finally, in @ref{Invoking astsegment}, we will discuss Segment's inputs,
outputs and configuration options.
+
+
+@menu
+* Invoking astsegment:: Inputs, outputs and options to Segment
+@end menu
+
+@c @node Segment changes after publication, Invoking astsegment, Segment,
Segment
+@c @subsection Segment changes after publication
+
+@c Segment's main algorithm and working strategy were initially defined and
introduced in Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi
and Ichikawa [2015]} and @url{https://arxiv.org/abs/1909.11230, Akhlaghi
[2019]}.
+@c It is strongly recommended to read those papers for a good understanding of
what Segment does, how it relates to detection, and how each parameter
influences the output.
+@c They have many figures showing every step on multiple mock and real
examples.
+
+@c However, the papers cannot be updated anymore, but Segment has evolved (and
will continue to do so): better algorithms or steps have been (and will be)
found.
+@c This book is thus the final and definitive guide to Segment.
+@c The aim of this section is to make the transition from the paper to your
installed version, as smooth as possible through the list below.
+@c For a more detailed list of changes in previous Gnuastro releases/versions,
please follow the @file{NEWS} file@footnote{The @file{NEWS} file is present in
the released Gnuastro tarball, see @ref{Release tarball}.}.
+
+@node Invoking astsegment, , Segment, Segment
+@subsection Invoking Segment
-At the lowest level, a dataset (for example, an image) is just a collection of
values, placed after each other in any number of dimensions (for example, an
image is a 2D dataset).
-Each data-element (pixel) just has two properties: its position (relative to
the rest) and its value.
-In higher-level analysis, an entire dataset (an image for example) is rarely
treated as a singular entity@footnote{You can derive the over-all properties of
a complete dataset (1D table column, 2D image, or 3D data-cube) treated as a
single entity with Gnuastro's Statistics program (see @ref{Statistics}).}.
-You usually want to know/measure the properties of the (separate)
scientifically interesting targets that are embedded in it.
-For example, the magnitudes, positions and elliptical properties of the
galaxies that are in the image.
+Segment will identify substructure within the detected regions of an input
image.
+Segment's output labels can be directly used for measurements (for example,
with @ref{MakeCatalog}).
+The executable name is @file{astsegment} with the following general template:
-MakeCatalog is Gnuastro's program for localized measurements over a dataset.
-In other words, MakeCatalog is Gnuastro's program to convert low-level
datasets (like images), to high level catalogs.
-The role of MakeCatalog in a scientific analysis and the benefits of its model
(where detection/segmentation is separated from measurement) is discussed in
@url{https://arxiv.org/abs/1611.06387v1, Akhlaghi [2016]}@footnote{A published
paper cannot undergo any more change, so this manual is the definitive guide.}
and summarized in @ref{Detection and catalog production}.
-We strongly recommend reading this short paper for a better understanding of
this methodology.
-Understanding the effective usage of MakeCatalog, will thus also help
effective use of other (lower-level) Gnuastro's programs like @ref{NoiseChisel}
or @ref{Segment}.
+@example
+$ astsegment [OPTION ...] InputImage.fits
+@end example
-It is important to define your regions of interest for measurements
@emph{before} running MakeCatalog.
-MakeCatalog is specialized in doing measurements accurately and efficiently.
-Therefore MakeCatalog will not do detection, segmentation, or defining
apertures on requested positions in your dataset.
-Following Gnuastro's modularity principle, there are separate and highly
specialized and customizable programs in Gnuastro for these other jobs as shown
below (for a usage example in a real-world analysis, see @ref{General program
usage tutorial} and @ref{Detecting large extended targets}).
+@noindent
+One line examples:
-@itemize
-@item
-@ref{Arithmetic}: Detection with a simple threshold.
+@example
+## Segment NoiseChisel's detected regions.
+$ astsegment default-noisechisel-output.fits
-@item
-@ref{NoiseChisel}: Advanced detection.
+## Use a hand-input S/N value for keeping true clumps
+## (avoid finding the S/N using the undetected regions).
+$ astsegment nc-out.fits --clumpsnthresh=10
-@item
-@ref{Segment}: Segmentation (substructure over detections).
+## Inspect all the segmentation steps after changing a parameter.
+$ astsegment input.fits --snquant=0.9 --checksegmentation
-@item
-@ref{MakeProfiles}: Aperture creation for known positions.
-@end itemize
+## Use the fixed value of 0.01 for the input's Sky standard deviation
+## (in the units of the input), and assume all the pixels are a
+## detection (for example, a large structure extending over the whole
+## image), and only keep clumps with S/N>10 as true clumps.
+$ astsegment in.fits --std=0.01 --detection=all --clumpsnthresh=10
+@end example
-These programs will/can return labeled dataset(s) to be fed into MakeCatalog.
-A labeled dataset for measurement has the same size/dimensions as the input,
but with integer valued pixels that have the label/counter for each sub-set of
pixels that must be measured together.
-For example, all the pixels covering one galaxy in an image, get the same
label.
+@cindex Gaussian
+@noindent
+If Segment is to do processing (for example, you do not want to get help, or
see the values of each option), at least one input dataset is necessary along
with detection and error information, either as separate datasets (per-pixel)
or fixed values, see @ref{Segment input}.
+Segment shares a large set of common operations with other Gnuastro programs,
mainly regarding input/output, general processing steps, and general operating
modes.
+To help in a unified experience between all of Gnuastro's programs, these common operations have the same names and are defined in @ref{Common options}.
-The requested measurements are then done on similarly labeled pixels.
-The final result is a catalog where each row corresponds to the measurements
on pixels with a specific label.
-For example, the flux weighted average position of all the pixels with a label
of 42 will be written into the 42nd row of the output catalog/table's central
position column@footnote{See @ref{Measuring elliptical parameters} for a
discussion on this and the derivation of positional parameters, which includes
the center.}.
-Similarly, the sum of all these pixels will be the 42nd row in the sum column,
etc.
-Pixels with labels equal to, or smaller than, zero will be ignored by
MakeCatalog.
-In other words, the number of rows in MakeCatalog's output is already known
before running it (the maximum value of the labeled dataset).
+As in all Gnuastro programs, options can also be given to Segment in
configuration files.
+For a thorough description of Gnuastro's configuration file parsing, please
see @ref{Configuration files}.
+All of Segment's options with a short description are also always available on
the command-line with the @option{--help} option, see @ref{Getting help}.
+To inspect the option values without actually running Segment, append your
command with @option{--printparams} (or @option{-P}).
-Before getting into the details of running MakeCatalog (in @ref{Invoking astmkcatalog}), we will start with a discussion on the basics of its approach to separating detection from measurements in @ref{Detection and catalog production}.
-A very important factor in any measurement is understanding its validity
range, or limits.
-Therefore in @ref{Quantifying measurement limits}, we will discuss how to
estimate the reliability of the detection and basic measurements.
-This section will continue with a derivation of elliptical parameters from the
labeled datasets in @ref{Measuring elliptical parameters}.
-For those who feel MakeCatalog's existing measurements/columns are not enough
and would like to add further measurements, in @ref{Adding new columns to
MakeCatalog}, a checklist of steps is provided for readily adding your own new
measurements/columns.
+To help in easy navigation between Segment's options, they are separately
discussed in the three sub-sections below: @ref{Segment input} discusses how
you can customize the inputs to Segment.
+@ref{Segmentation options} is devoted to options specific to the high-level
segmentation process.
+Finally, in @ref{Segment output}, we will discuss options that affect
Segment's output.
@menu
-* Detection and catalog production:: Discussing why/how to treat these
separately
-* Brightness flux magnitude:: More on Magnitudes, surface brightness, etc.
-* Quantifying measurement limits:: For comparing different catalogs.
-* Measuring elliptical parameters:: Estimating elliptical parameters.
-* Adding new columns to MakeCatalog:: How to add new columns.
-* MakeCatalog measurements:: List of all the measurements/columns by
MakeCatalog.
-* Invoking astmkcatalog:: Options and arguments to MakeCatalog.
+* Segment input:: Input files and options.
+* Segmentation options:: Parameters of the segmentation process.
+* Segment output:: Outputs of Segment
@end menu
-@node Detection and catalog production, Brightness flux magnitude,
MakeCatalog, MakeCatalog
-@subsection Detection and catalog production
-
-Most existing common tools in low-level astronomical data-analysis (for
example,
SExtractor@footnote{@url{https://www.astromatic.net/software/sextractor}})
merge the two processes of detection and measurement (catalog production) in
one program.
-However, in light of Gnuastro's modularized approach (modeled on the Unix
system) detection is separated from measurements and catalog production.
-This modularity is therefore new to many experienced astronomers and deserves
a short review here.
-Further discussion on the benefits of this methodology can be seen in
@url{https://arxiv.org/abs/1611.06387v1, Akhlaghi [2016]}.
-
-As discussed in the introduction of @ref{MakeCatalog}, detection (identifying
which pixels to do measurements on) can be done with different programs.
-Their outputs (a labeled dataset) can be directly fed into MakeCatalog to do
the measurements and write the result as a catalog/table.
-Beyond that, Gnuastro's modular approach has many benefits that will become
clear as you get more experienced in astronomical data analysis and want to be
more creative in using your valuable data for the exciting scientific project
you are working on.
-In short the reasons for this modularity can be classified as below:
-
-@itemize
-
-@item
-Simplicity/robustness of independent, modular tools: making a catalog is a
logically separate process from labeling (detection, segmentation, or aperture
production).
-A user might want to do certain operations on the labeled regions before
creating a catalog for them.
-Another user might want the properties of the same pixels/objects in another
image (another filter for example) to measure the colors or SED fittings.
-
-Here is an example of doing both: suppose you have images in various broad
band filters at various resolutions and orientations.
-The image of one color will thus not lie exactly on another or even be in the
same scale.
-However, it is imperative that the same pixels be used in measuring the colors
of galaxies.
-
-To solve the problem, NoiseChisel can be run on the reference image to
generate the labeled detection image.
-Afterwards, the labeled image can be warped into the grid of the other color
(using @ref{Warp}).
-MakeCatalog will then generate the same catalog for both colors (with the
different labeled images).
-It is currently customary to warp the images to the same pixel grid, however,
modification of the scientific dataset is very harmful for the data and creates
correlated noise.
-It is much more accurate to do the transformations on the labeled image.
+@node Segment input, Segmentation options, Invoking astsegment, Invoking
astsegment
+@subsubsection Segment input
-@item
-Complexity of a monolith: Adding in a catalog functionality to the detector
program will add several more steps (and many more options) to its processing
that can equally well be done outside of it.
-This makes following what the program does harder for the users and
developers, it can also potentially add many bugs.
+Besides the input dataset (for example, an astronomical image), Segment also needs to know the Sky standard deviation and the regions of the dataset that it should segment.
+The values dataset is assumed to be Sky subtracted by default.
+If it is not, you can ask Segment to subtract the Sky internally by calling @option{--sky}.
+For the rest of this discussion, we will assume it is already Sky subtracted.
-As an example, if the parameter you want to measure over one profile is not provided by the developers of MakeCatalog, you can simply open this tiny program and easily add your desired calculation.
-This process is discussed in @ref{Adding new columns to MakeCatalog}.
-However, if making a catalog was part of NoiseChisel for example, adding a new
column/measurement would require a lot of energy to understand all the steps
and internal structures of that huge program.
-It might even be so intertwined with its processing, that adding new columns
might cause problems/bugs in its primary job (detection).
+The Sky and its standard deviation can be a single value (to be used for the
whole dataset) or a separate dataset (for a separate value per pixel).
+If a dataset is used for the Sky and its standard deviation, they must either
be the size of the input image, or have a single value per tile (generated with
@option{--oneelempertile}, see @ref{Processing options} and @ref{Tessellation}).
-@end itemize
+The detected regions/pixels can be specified as a detection map (for example,
see @ref{NoiseChisel output}).
+If @option{--detection=all}, Segment will not read any detection map and
assume the whole input is a single detection.
+For example, when the dataset is fully covered by a large nearby
galaxy/globular cluster.
+When datasets are to be used for any of the inputs, Segment will assume they are multiple extensions of a single file by default (when @option{--std} or @option{--detection} are not called).
+For example, NoiseChisel's default output (see @ref{NoiseChisel output}).
+When the Sky-subtracted values are in one file, and the detection and Sky
standard deviation are in another, you just need to use @option{--detection}:
in the absence of @option{--std}, Segment will look for both the detection
labels and Sky standard deviation in the file given to @option{--detection}.
+Ultimately, if all three are in separate files, you need to call both
@option{--detection} and @option{--std}.
+The extensions of the three mandatory inputs can be specified with
@option{--hdu}, @option{--dhdu}, and @option{--stdhdu}.
+For a full discussion on what to give to these options, see the description of
@option{--hdu} in @ref{Input output options}.
+To see their default values (along with all the other options), run Segment
with the @option{--printparams} (or @option{-P}) option.
+Just recall that in the absence of @option{--detection} and @option{--std},
all three are assumed to be in the same file.
+If you only want to see Segment's default values for HDUs on your system, run
this command:
+@example
+$ astsegment -P | grep hdu
+@end example
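+
+For example, a sketch of the fully separated scenario (hypothetical
+file names, with the values, detection map and Sky standard deviation
+in the first HDU of three different files):
+
+@example
+$ astsegment values.fits --detection=det.fits --dhdu=1 \
+             --std=std.fits --stdhdu=1
+@end example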
+By default Segment will convolve the input with a kernel to improve the
signal-to-noise ratio of true peaks.
+If you already have the convolved input dataset, you can pass it directly to
Segment for faster processing (using the @option{--convolved} and
@option{--chdu} options).
+Just do not forget that the convolved image must also be Sky-subtracted before
calling Segment.
+If a value/file is given to @option{--sky}, the convolved values will also be
Sky subtracted internally.
+Alternatively, if you prefer to give a kernel (with @option{--kernel} and
@option{--khdu}), Segment can do the convolution internally.
+To disable convolution, use @option{--kernel=none}.
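+
+For example (a sketch; @file{conv.fits} is a hypothetical
+Sky-subtracted convolved image):
+
+@example
+## Use an existing convolved image:
+$ astsegment nc.fits --convolved=conv.fits --chdu=1
+
+## Disable convolution altogether:
+$ astsegment nc.fits --kernel=none
+@end example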
+@table @option
+@item --sky=STR/FLT
+The Sky value(s) to subtract from the input.
+This option can either be given a constant number or a file name containing a
dataset (multiple values, per pixel or per tile).
+By default, Segment will assume the input dataset is Sky subtracted, so this
option is not mandatory.
+If the value cannot be read as a number, it is assumed to be a file name.
+When the value is a file, the extension can be specified with
@option{--skyhdu}.
+When it is not a single number, the given dataset must either have the same
size as the output or the same size as the tessellation (so there is one pixel
per tile, see @ref{Tessellation}).
+When this option is given, its value(s) will be subtracted from the input and
the (optional) convolved dataset (given to @option{--convolved}) prior to
starting the segmentation process.
+@item --skyhdu=STR/INT
+The HDU/extension containing the Sky values.
+This is mandatory when the value given to @option{--sky} is not a number.
+Please see the description of @option{--hdu} in @ref{Input output options} for
the different ways you can identify a special extension.
-@node Brightness flux magnitude, Quantifying measurement limits, Detection and
catalog production, MakeCatalog
-@subsection Brightness, Flux, Magnitude and Surface brightness
+@item --std=STR/FLT
+The Sky standard deviation value(s) corresponding to the input.
+The value can either be a constant number or a file name containing a dataset
(multiple values, per pixel or per tile).
+The Sky standard deviation is mandatory for Segment to operate.
-@cindex ADU
-@cindex Gain
-@cindex Counts
-Astronomical data pixels are usually in units of counts@footnote{Counts are
also known as analog to digital units (ADU).} or electrons or either one
divided by seconds.
-To convert from the counts to electrons, you will need to know the instrument
gain.
-In any case, they can be directly converted to energy or energy/time using the
basic hardware (telescope, camera and filter) information (that is summarized
in the @emph{zero point}, and we will discuss below).
-We will continue the discussion assuming the pixels are in units of
energy/time.
+If the value cannot be read as a number, it is assumed to be a file name.
+When the value is a file, the extension can be specified with @option{--stdhdu}.
+When it is not a single number, the given dataset must either have the same
size as the output or the same size as the tessellation (so there is one pixel
per tile, see @ref{Tessellation}).
-@table @asis
-@cindex Flux
-@cindex Luminosity
-@cindex Brightness
-@item Brightness
-The @emph{brightness} of an object is defined as its measured energy in units
of time.
-If our detector pixels directly measured the energy from the astronomical
object@footnote{In practice, the measured pixels don't just count the
astronomical object's energy: imaging detectors insert a certain bias level
before the exposure, they amplify the photo-electrons, there are optical
artifacts like flat-fielding, and finally, there is the background light.},
then the brightness would be the total sum of pixel values (energy) associated
to the object, divided by the exposure time.
-The @emph{flux} of an object is defined in units of
energy/time/collecting-area.
-For an astronomical target, the flux is therefore defined as its brightness
divided by the area used to collect the light from the source; or the telescope
aperture (for example, in units of @mymath{cm^2}).
-Knowing the flux (@mymath{f}) and distance to the object (@mymath{r}), we can
define its @emph{luminosity}: @mymath{L=4{\pi}r^2f}.
+When this option is not called, Segment will assume the standard deviation is a dataset in an HDU/extension (@option{--stdhdu}) of one of the other input file(s).
+If a file is given to @option{--detection}, it will assume that file contains the standard deviation dataset; otherwise, it will look into the main input file (the argument given without any option).
-Therefore, while flux and luminosity are intrinsic properties of the object,
brightness depends on our detecting tools (hardware and software).
-In low-level observational astronomy data analysis, we are usually more
concerned with measuring the brightness, because it is the thing we directly
measure from the image pixels and create in catalogs.
-On the other hand, luminosity is used in higher-level analysis (after image
contents are measured as catalogs to deduce physical interpretations, because
high-level things like distance/redshift need to be calculated).
-At this stage, it is just important avoid confusion between luminosity and
brightness because both have the same units of energy per seconds.
+@item --stdhdu=INT/STR
+The HDU/extension containing the Sky standard deviation values, when the value
given to @option{--std} is a file name.
+Please see the description of @option{--hdu} in @ref{Input output options} for
the different ways you can identify a special extension.
-@item Magnitude
-@cindex Magnitudes from flux
-@cindex Flux to magnitude conversion
-@cindex Astronomical Magnitude system
-Images of astronomical objects span over a very large range of brightness: the
Sun (as the brightest object) is roughly @mymath{2.5^{60}=10^{24}} times
brighter than the fainter galaxies we can currently detect in the deepest
images.
-Therefore discussing brightness directly will involve a large range of values
which is inconvenient.
-So astronomers have chosen to use a logarithmic scale for the brightness of
astronomical objects.
+@item --variance
+The input Sky standard deviation value/dataset is actually variance.
+When this option is called, the square root of the dataset given to @option{--std} is used internally, not its raw value(s).
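+
+For example, if your noise estimate is stored as a variance image (a
+sketch with hypothetical file names):
+
+@example
+$ astsegment in.fits --std=var.fits --stdhdu=1 --variance
+@end example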
-@cindex Hipparchus of Nicaea
-But the logarithm can only be usable with a dimensionless value that is always
positive.
-Fortunately brightness is always positive (at least in theory@footnote{In
practice, for very faint objects, if the background brightness is
over-subtracted, we may end up with a negative ``brightness'' or sum of pixels
in a real object.}).
-To remove the dimensions, we divide the brightness of the object (@mymath{B})
by a reference brightness (@mymath{B_r}).
-We then define a logarithmic scale as @mymath{magnitude} through the relation
below.
-The @mymath{-2.5} factor in the definition of magnitudes is a legacy of the
our ancient colleagues and in particular Hipparchus of Nicaea (190-120 BC).
+@item -d FITS
+@itemx --detection=FITS
+Detection map to use for segmentation.
+If given a value of @option{all}, Segment will assume the whole dataset must
be segmented, see below.
+If a detection map is given, the extension can be specified with
@option{--dhdu}.
+If not given, Segment will assume the desired HDU/extension is in the main
input argument (input file specified with no option).
-@dispmath{m-m_r=-2.5\log_{10} \left( B \over B_r \right)}
+The final segmentation (clumps or objects) will only be over the non-zero
pixels of this detection map.
+The dataset must have the same size as the input image.
+Only datasets with an integer type are acceptable for the labeled image, see
@ref{Numeric data types}.
+If your detection map only has integer values, but it is stored in a floating
point container, you can use Gnuastro's Arithmetic program (see
@ref{Arithmetic}) to convert it to an integer container, like the example below:
-@noindent
-@mymath{m} is defined as the magnitude of the object and @mymath{m_r} is the
pre-defined magnitude of the reference brightness.
-For estimating the error in measuring a magnitude, see @ref{Quantifying
measurement limits}.
+@example
+$ astarithmetic float.fits int32 --output=int.fits
+@end example
-@item Zero point
-@cindex Zero point magnitude
-@cindex Magnitude zero point
-A unique situation in the magnitude equation above occurs when the reference
brightness is unity (@mymath{B_r=1}).
-This brightness will thus summarize all the hardware-specific parameters
discussed above (like the conversion of pixel values to physical units) into
one number.
-That reference magnitude is commonly known as the @emph{Zero point} magnitude
because when @mymath{B=B_r=1}, the right side of the magnitude definition above
will be zero.
-Using the zero point magnitude (@mymath{Z}), we can write the magnitude relation above in a simpler format:
+It may happen that the whole input dataset is covered by signal, for example,
when working on parts of the Andromeda galaxy, or nearby globular clusters
(that cover the whole field of view).
+In such cases, segmentation is necessary over the complete dataset, not just
specific regions (detections).
+By default Segment will first use the undetected regions as a reference to find the proper signal-to-noise ratio of ``true'' clumps (at the purity level specified with @option{--snquant}).
+Therefore, in such scenarios you also need to manually give a ``true'' clump signal-to-noise ratio with the @option{--clumpsnthresh} option to disable looking into the undetected regions, see @ref{Segmentation options}.
+In such cases, it is possible to make a detection map that only has the value @code{1} for all pixels (for example, using @ref{Arithmetic}), but for convenience, you can also use @option{--detection=all}.
-@dispmath{m = -2.5\log_{10}(B) + Z}
+@item --dhdu
+The HDU/extension containing the detection map given to @option{--detection}.
+Please see the description of @option{--hdu} in @ref{Input output options} for
the different ways you can identify a special extension.
-@cindex Janskys (Jy)
-@cindex AB magnitude
-@cindex Magnitude, AB
-Gnuastro has an installed script to estimate the zero point of any image, see
@ref{Zero point estimation} (it contains practical tutorials to help you get
started fast).
-Having the zero point of an image, you can convert its pixel values to
physical units like microJanskys (or @mymath{\mu{}Jy}).
-This enables direct pixel-based comparisons with images from other
instruments@footnote{Comparing data from different instruments assumes
instrument and observation signatures are properly corrected, things like the
flat-field or the Sky absorption.
-It is also valid for pixel values, assuming that factors that can change the
morphology (like the @ref{PSF}) are the same.}.
-Jansky is a commonly used unit for measuring spectral flux density and one
Jansky is equivalent to @mymath{10^{-26} W/m^2/Hz} (watts per square meter per
hertz).
+@item -k FITS
+@itemx --kernel=FITS
+The name of the file containing the kernel that will be used to convolve the input image.
+The usage of this option is identical to NoiseChisel's @option{--kernel}
option (@ref{NoiseChisel input}).
+Please see the descriptions there for more.
+To disable convolution, you can give it a value of @option{none}.
-This conversion can be done with the fact that in the AB magnitude
standard@footnote{@url{https://en.wikipedia.org/wiki/AB_magnitude}},
@mymath{3631Jy} corresponds to the zero-th magnitude, therefore
@mymath{B\equiv3631\times10^{6}\mu{Jy}} and @mymath{m\equiv0}.
-We can therefore estimate the brightness (@mymath{B_z}, in @mymath{\mu{Jy}})
corresponding to the image zero point (@mymath{Z}) using this equation:
+@item --khdu
+The HDU/extension containing the kernel used for convolution.
+For acceptable values, please see the description of @option{--hdu} in
@ref{Input output options}.
-@dispmath{m - Z = -2.5\log_{10}(B/B_z)}
-@dispmath{0 - Z = -2.5\log_{10}({3631\times10^{6}\over B_z})}
-@dispmath{B_z = 3631\times10^{\left(6 - {Z \over 2.5} \right)} \mu{Jy}}
+@item --convolved=FITS
+The convolved image's file name to avoid internal convolution by Segment.
+The usage of this option is identical to NoiseChisel's @option{--convolved}
option.
+Please see @ref{NoiseChisel input} for a thorough discussion of the usefulness
and best practices of using this option.
-@cindex SDSS
-Because the image zero point corresponds to a pixel value of @mymath{1}, the
@mymath{B_z} value calculated above also corresponds to a pixel value of
@mymath{1}.
-Therefore you simply have to multiply your image by @mymath{B_z} to convert it
to @mymath{\mu{Jy}}.
-Do Not forget that this only applies when your zero point was also estimated
in the AB magnitude system.
-On the command-line, you can estimate this value for a certain zero point with
AWK, then multiply it to all the pixels in the image with @ref{Arithmetic}.
-For example, let's assume you are using an SDSS image with a zero point of
22.5:
+If you want to use the same convolution kernel for detection (with
@ref{NoiseChisel}) and segmentation, with this option, you can use the same
convolved image (that is also available in NoiseChisel) and avoid two
convolutions.
+However, just be careful to use the input to NoiseChisel as the input to
Segment also, then use the @option{--sky} and @option{--std} to specify the Sky
and its standard deviation (from NoiseChisel's output).
+Recall that when NoiseChisel is not called with @option{--rawoutput}, the
first extension of NoiseChisel's output is the @emph{Sky-subtracted} input (see
@ref{NoiseChisel output}).
+So if you use the same convolved image that you fed to NoiseChisel, but use
NoiseChisel's output with Segment's @option{--convolved}, then the convolved
image will not be Sky subtracted.
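+
+For example, a sketch of this workflow with hypothetical file names
+(the extension names are assumed to follow NoiseChisel's defaults
+described above):
+
+@example
+## Convolve once, then share the result:
+$ astconvolve image.fits --kernel=kernel.fits --output=conv.fits
+$ astnoisechisel image.fits --convolved=conv.fits --output=nc.fits
+$ astsegment image.fits --convolved=conv.fits \
+             --detection=nc.fits --dhdu=DETECTIONS \
+             --sky=nc.fits --skyhdu=SKY \
+             --std=nc.fits --stdhdu=SKY_STD
+@end example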
-@example
-bz=$(echo 22.5 | awk '@{print 3631 * 10^(6-$1/2.5)@}')
-astarithmetic sdss.fits $bz x --output=sdss-in-muJy.fits
-@end example
+@item --chdu
+The HDU/extension containing the convolved image (given to
@option{--convolved}).
+For acceptable values, please see the description of @option{--hdu} in
@ref{Input output options}.
-@noindent
-But in Gnuastro, it gets even easier: Arithmetic has an operator called
@code{counts-to-jy}.
-This will directly convert your image pixels (in units of counts) to Janskys
though a provided AB Magnitude-based zero point like below.
-See @ref{Arithmetic operators} for more.
+@item -L INT[,INT]
+@itemx --largetilesize=INT[,INT]
+The size of the large tiles to use for identifying the clump S/N threshold
over the undetected regions.
+The usage of this option is identical to NoiseChisel's
@option{--largetilesize} option (@ref{NoiseChisel input}).
+Please see the descriptions there for more.
-@example
-$ astarithmetic sdss.fits 22.5 counts-to-jy
-@end example
+The undetected regions can be a significant fraction of the dataset and
finding clumps requires sorting of the desired regions, which can be slow.
+To speed up the processing, Segment finds clumps in the undetected regions over separate large tiles.
+This allows it to sort a much smaller set of pixels and to treat the tiles independently and in parallel.
+Both of these greatly speed it up.
+Just be sure not to decrease the large tile sizes too much (less than 100 pixels in each dimension).
+It is important for them to be much larger than the clumps.
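+
+For example (a sketch with a hypothetical input):
+
+@example
+$ astsegment nc.fits --largetilesize=250,250
+@end example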
-@cartouche
-@noindent
-@strong{Be careful with the exposure time:} as described at the start of this
section, we are assuming your data are in units of counts/sec.
-As a result, the counts you get from the command above, are only for one
second of exposure!
-Please see the discussion below in ``Magnitude to counts'' for more.
-@end cartouche
+@end table
-@item Magnitude to counts (accounting for exposure time)
-@cindex Exposure time
-Until now, we had assumed that the data are in units of counts/sec.
-As a result, the equations given above (in the ``Zero point'' item to convert
magnitudes to pixel counts), give the count level for the reference (1 second)
exposure.
-But we rarely take 1 second exposures!
-It is therefore very important to take the exposure time into account in
scenarios like simulating observations with varying exposure times (where you
need to know how many counts the object of a certain magnitude will add to a
certain image with a certain exposure time).
-To clarify the concept, let's define @mymath{C} as the @emph{counted}
electrons (which has a linear relation with the photon energy entering the CCD
pixel).
-In this case, if an object of brightness @mymath{B} is observed for @mymath{t} seconds, it will accumulate @mymath{C=B\times t} counts@footnote{Recall that counts are another name for ADUs, which already include the CCD gain.}.
-Therefore, the generic magnitude equation above can be written as:
-@dispmath{m = -2.5\log_{10}(B) + Z = -2.5\log_{10}(C/t) + Z}
-@noindent
-From this, we can derive @mymath{C(t)} in relation to @mymath{C(1)}, or counts
from a 1 second exposure, using this relation:
-@dispmath{C(t) = t\times10^{(m-Z)/2.5} = t\times C(1)}
-In other words, you should simply multiply the counts for one second with the
number of observed seconds.
+@node Segmentation options, Segment output, Segment input, Invoking astsegment
+@subsubsection Segmentation options
-Another approach is to shift the time-dependence of the counts into the zero
point (after all exposure time is also a hardware issue).
-Let's derive the equation below:
-@dispmath{m = -2.5\log_{10}(C/t) + Z = -2.5\log_{10}(C) + 2.5\log_{10}(t) + Z}
-Therefore, defining an exposure-time-dependent zero point as @mymath{Z(t)}, we
can directly correlate a certain object's magnitude with counts after an
exposure of @mymath{t} seconds:
-@dispmath{m = -2.5\log_{10}(C) + Z(t) \quad\rm{where}\quad Z(t)=Z +
2.5\log_{10}(t)}
-This solution is useful in programs like @ref{MakeCatalog} or
@ref{MakeProfiles}, when you cannot (or do not want to: because of the extra
storage/speed costs) manipulate the values image (for example, divide it by the
exposure time to use a counts/sec zero point).
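-For example, with a zero point of @mymath{Z=22.5} (defined for 1 second) and
an object of @mymath{m=20}, a @mymath{t=600} second exposure accumulates
@mymath{600\times10^{(20-22.5)/-2.5}=6000} counts.
-As a small sketch (with these arbitrary values), Arithmetic can do this
scalar calculation directly on the command-line:
-
-@example
-$ astarithmetic 10 22.5 20 - 2.5 / pow 600 x --quiet
-@end example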
+The options below can be used to configure every step of the segmentation
process in the Segment program.
+For a more complete explanation (with figures to demonstrate each step),
please see Section 3.2 of @url{https://arxiv.org/abs/1505.01664, Akhlaghi and
Ichikawa [2015]}, and also @ref{Segment}.
+By default, Segment will follow the procedure described in the paper to find
the S/N threshold based on the noise properties.
+This can be disabled by directly giving a trustable signal-to-noise ratio to
the @option{--clumpsnthresh} option.
-@item Surface brightness
-@cindex Steradian
-@cindex Angular coverage
-@cindex Celestial sphere
-@cindex Surface brightness
-@cindex SI (International System of Units)
-Another important concept is the distribution of an object's brightness over
its area.
-For this, we define the @emph{surface brightness} to be the magnitude of an
object's brightness divided by its solid angle over the celestial sphere (or
coverage in the sky, commonly in units of arcsec@mymath{^2}).
-The solid angle is expressed in units of arcsec@mymath{^2} because
astronomical targets are usually much smaller than one steradian.
-Recall that the steradian is the dimension-less SI unit of a solid angle and 1
steradian covers @mymath{1/4\pi} (almost @mymath{8\%}) of the full celestial
sphere.
+Recall that you can always see the full list of Gnuastro's options with the
@option{--help} (see @ref{Getting help}), or @option{--printparams} (or
@option{-P}) to see their values (see @ref{Operating mode options}).
-Surface brightness is therefore most commonly expressed in units of
mag/arcsec@mymath{^2}.
-For example, when the brightness is measured over an area of @mymath{A}
arcsec@mymath{^2}, then the surface brightness becomes:
+@table @option
-@dispmath{S = -2.5\log_{10}(B/A) + Z = -2.5\log_{10}(B) + 2.5\log_{10}(A) + Z}
+@item -B FLT
+@itemx --minskyfrac=FLT
+Minimum fraction (value between 0 and 1) of Sky (undetected) areas in a large
tile.
+Only (large) tiles with a fraction of undetected pixels (Sky) greater than
this value will be used for finding clumps.
+The clumps found in the undetected areas will be used to estimate a S/N
threshold for true clumps.
+Therefore, in crowded fields, this is an important option that you may need to decrease.
+Operationally, this is almost identical to NoiseChisel's @option{--minskyfrac}
option (@ref{Detection options}).
+Please see the descriptions there for more.
-@noindent
-In other words, the surface brightness (in units of mag/arcsec@mymath{^2}) is
related to the object's magnitude (@mymath{m}) and area (@mymath{A}, in units
of arcsec@mymath{^2}) through this equation:
+@item --minima
+Build the clumps based on the local minima, not maxima.
+By default, clumps are built starting from local maxima (see Figure 8 of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}).
+Therefore, this option can be useful when you are searching for true local
minima (for example, absorption features).
-@dispmath{S = m + 2.5\log_{10}(A)}
+@item -m INT
+@itemx --snminarea=INT
+The minimum area which a clump in the undetected regions should have in order
to be considered in the clump Signal to noise ratio measurement.
+If this size is set to a small value, the Signal to noise ratio of false
clumps will not be accurately found.
+It is recommended that this value be larger than the value given to
NoiseChisel's @option{--snminarea}, because the clumps are found on the
convolved (smoothed) image while the pseudo-detections are found on the input
image.
+You can use @option{--checksn} and @option{--checksegmentation} to see if your
chosen value is reasonable or not.
-A common mistake is to follow the mag/arcsec@mymath{^2} unit literally, and
divide the object's magnitude by its area.
-But this is wrong because magnitude is a logarithmic scale while area is
linear.
-It is the brightness that should be divided by the solid angle because both
have linear scales.
-The magnitude of that ratio is then defined to be the surface brightness.
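-For example (a small sketch with made-up numbers), an object of magnitude
@mymath{m=20} spread over @mymath{A=25} arcsec@mymath{^2} has a surface
brightness of @mymath{20+2.5\log_{10}(25)\approx23.5} mag/arcsec@mymath{^2}.
-You can verify this with Arithmetic's scalar mode:
-
-@example
-$ astarithmetic 25 log10 2.5 x 20 + --quiet
-@end example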
+@item --checksn
+Save the S/N values of the clumps over the sky and detected regions into
separate tables.
+If @option{--tableformat} is a FITS format, each table will be written into a
separate extension of one file suffixed with @file{_clumpsn.fits}.
+If it is plain text, a separate file will be made for each table (ending in
@file{_clumpsn_sky.txt} and @file{_clumpsn_det.txt}).
+For more on @option{--tableformat} see @ref{Input output options}.
-One usual application of this is to convert an image's pixel values to surface
brightness, when you know its zero point.
-This can be done with the two simple commands below.
-First, we derive the pixel area (in arcsec@mymath{^2}); then we use Arithmetic
to convert the pixels into surface brightness, see below for the details.
+You can use these tables to inspect the S/N values and their distribution (in
combination with the @option{--checksegmentation} option to see where the
clumps are).
+You can use Gnuastro's @ref{Statistics} to make a histogram of the
distribution (ready for plotting in a text file, or a crude ASCII-art
demonstration on the command-line).
-@example
-$ zeropoint=22.5
-$ pixarea=$(astfits image.fits --pixelareaarcsec2)
-$ astarithmetic image.fits $zeropoint $pixarea counts-to-sb \
- --output=image-sb.fits
-@end example
+With this option, Segment will abort as soon as the two tables are created.
+This allows you to inspect the steps leading to the final S/N quantile
threshold; this behavior can be disabled with @option{--continueaftercheck}.
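+As a minimal sketch of such an inspection (the input name @file{image.fits}
is hypothetical, and the column number of the S/N values is an assumption;
check it first with @command{asttable -i}):
+
+@example
+$ astsegment image.fits --checksn --tableformat=txt
+$ asttable image_clumpsn_sky.txt -i
+$ aststatistics image_clumpsn_sky.txt -c2 --asciihist
+@end example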
-See @ref{Reverse polish notation} for more on Arithmetic's notation and
@ref{Arithmetic operators} for a description of each operator.
-And see @ref{FITS images in a publication} for a fully working tutorial on how
to optimally convert a FITS image to a PDF image for usage in a publication
using the surface brightness conversion shown above.
+@item --minnumfalse=INT
+The minimum number of clumps over undetected (Sky) regions to identify the
requested Signal-to-Noise ratio threshold.
+Operationally, this is almost identical to NoiseChisel's
@option{--minnumfalse} option (@ref{Detection options}).
+Please see the descriptions there for more.
-@cartouche
-@noindent
-@strong{Do Not warp or convolve magnitude or surface brightness images:}
Warping an image involves calculating new pixel values (of the new pixel grid)
from the old pixel values.
-Convolution is also a process of finding the weighted mean of pixel values.
-During these processes, many arithmetic operations are done on the original
pixel values, for example, addition or multiplication.
-However, @mymath{\log_{10}(a+b)\ne \log_{10}(a)+\log_{10}(b)}.
-Therefore after calculating a magnitude or surface brightness image, do not
apply any such operations on it!
-If you need to warp or convolve the image, do it @emph{before} the conversion.
-@end cartouche
-@end table
+@item -c FLT
+@itemx --snquant=FLT
+The quantile of the signal-to-noise ratio distribution of clumps in undetected
regions, used to define true clumps.
+After identifying all the usable clumps in the undetected regions of the
dataset, the given quantile of their signal-to-noise ratios is used to define
the signal-to-noise ratio of a ``true'' clump.
+Effectively, this can be seen as an inverse p-value measure.
+See Figure 9 and Section 3.2.1 of @url{https://arxiv.org/abs/1505.01664,
Akhlaghi and Ichikawa [2015]} for a complete explanation.
+The full distribution of clump signal-to-noise ratios over the undetected
areas can be saved into a table with @option{--checksn} option and visually
inspected with @option{--checksegmentation}.
+
+@item -v
+@itemx --keepmaxnearriver
+Keep a clump whose maximum (minimum if @option{--minima} is called) flux is
8-connected to a river pixel.
+By default such clumps over detections are considered to be noise and are
removed irrespective of their significance measure (see
@url{https://arxiv.org/abs/1909.11230,Akhlaghi 2019}).
+Over large profiles that sink into the noise very slowly, noise can cause
part of the profile (which would be flat without noise) to become a very large
clump with a very high signal-to-noise ratio.
+In such cases, the pixel with the maximum flux in the clump will be
immediately touching a river pixel.
+@item -s FLT
+@itemx --clumpsnthresh=FLT
+The signal-to-noise threshold for true clumps.
+If this option is given, then the segmentation options above will be ignored
and the given value will be directly used to identify true clumps over the
detections.
+This can be useful if you have a large dataset with similar noise properties.
+You can find a robust signal-to-noise ratio based on a (sufficiently large)
smaller portion of the dataset.
+Afterwards, with this option, you can speed up the processing on the whole
dataset.
+Another scenario where this option may be useful is when the image does not
contain enough (or any) Sky regions.
+@item -G FLT
+@itemx --gthresh=FLT
+Threshold (multiple of the sky standard deviation added with the sky) to stop
growing true clumps.
+Once true clumps are found, they are set as the basis to segment the detected
region.
+They are grown until the threshold specified by this option.
+@item -y INT
+@itemx --minriverlength=INT
+The minimum length of a river between two grown clumps for it to be considered
in signal-to-noise ratio estimations.
+Similar to @option{--snminarea}, if the length of the river is too short, the
signal-to-noise ratio can be noisy and unreliable.
+Any existing rivers shorter than this length will be considered as
non-existent, independent of their Signal to noise ratio.
+The clumps are grown on the input image, therefore this value can be smaller
than the value given to @option{--snminarea}.
+Recall that the clumps were defined on the convolved image so
@option{--snminarea} should be larger.
+@item -O FLT
+@itemx --objbordersn=FLT
+The maximum Signal to noise ratio of the rivers between two grown clumps in
order to consider them as separate `objects'.
+If the Signal to noise ratio of the river between two grown clumps is larger
than this value, they are defined to be part of one `object'.
+Note that the physical reality of these `objects' can never be established
with one image, or even multiple images from one broad-band filter.
+Any method we devise to define `objects' over a detected region is ultimately
subjective.
-@node Quantifying measurement limits, Measuring elliptical parameters,
Brightness flux magnitude, MakeCatalog
-@subsection Quantifying measurement limits
+Two very distant galaxies or satellites in one halo might lie in the same line
of sight and be detected as clumps on one detection.
+On the other hand, the connection (through a spiral arm or tidal tail for
example) between two parts of one galaxy might have such a low surface
brightness that they are broken up into multiple detections or objects.
+In fact, as you may have noticed, exactly for this purpose, this is the only
signal-to-noise ratio that the user has to provide: the `true' detections and
clumps can be objectively identified from the noise characteristics of the
image, so you do not have to give any signal-to-noise ratio by hand.
-@cindex Depth of data
-@cindex Clump magnitude limit
-@cindex Object magnitude limit
-@cindex Limit, object/clump magnitude
-@cindex Magnitude, object/clump detection limit
-No measurement on a real dataset can be perfect: you can only reach a certain
level/limit of accuracy and a meaningful (scientific) analysis requires an
understanding of these limits.
-Different datasets have different noise properties and different detection
methods (one method/algorithm/software that is run with a different set of
parameters is considered as a different detection method) will have different
abilities to detect or measure certain kinds of signal (astronomical objects)
and their properties in the dataset.
-Hence, quantifying the detection and measurement limitations with a particular
dataset and analysis tool is the most crucial/critical aspect of any high-level
analysis.
-In two separate tutorials, we have touched upon some of these points.
-So to see the discussions below in action (on real data), see @ref{Measuring
the dataset limits} and @ref{Image surface brightness limit}.
+@item --checksegmentation
+A file with the suffix @file{_seg.fits} will be created.
+This file keeps all the relevant steps in finding true clumps and segmenting
the detections into multiple objects in various extensions.
+Having read the paper (or the steps above), examining this file can be an
excellent guide in choosing the best set of parameters.
+Note that calling this option will significantly slow Segment.
+In verbose mode (without the @option{--quiet} option, see @ref{Operating mode
options}) the important steps (along with their extension names) will also be
reported.
-Here, we will review some of the most commonly used methods to quantify the
limits in astronomical data analysis and how MakeCatalog makes it easy to
measure them.
-Depending on the higher-level analysis, there are more tests that must be
done, but these are relatively low-level and usually necessary in most cases.
-In astronomy, it is common to use the magnitude (a unit-less scale) and
physical units, see @ref{Brightness flux magnitude}.
-Therefore the measurements discussed here are commonly used in units of
magnitudes.
+With this option, Segment will abort as soon as the check image is created.
+This behavior can be disabled with @option{--continueaftercheck}.
-@menu
-* Standard deviation vs error:: The std is not a measure of the error.
-* Magnitude measurement error of each detection:: Error in measuring
magnitude.
-* Surface brightness error of each detection:: Error in measuring the Surface
brightness.
-* Completeness limit of each detection:: Possibility of detecting similar
objects?
-* Upper limit magnitude of each detection:: How reliable is your magnitude?
-* Magnitude limit of image:: Measured magnitude of objects at certain S/N.
-* Surface brightness limit of image:: Extrapolate per-pixel noise-level to
standard units.
-* Upper limit magnitude of image:: Measure the noise-level for a certain
aperture.
-@end menu
+@end table
-@node Standard deviation vs error, Magnitude measurement error of each
detection, Quantifying measurement limits, Quantifying measurement limits
-@subsubsection Standard deviation vs error
-The error and the standard deviation are sometimes confused with each other.
-Therefore, before continuing with the various measurement limits below, let's
review these two fundamental concepts.
-Instead of going into the theoretical definitions of the two (which you can
see in their respective Wikipedia pages), we'll discuss the concepts in a
hands-on and practical way here.
+@node Segment output, , Segmentation options, Invoking astsegment
+@subsubsection Segment output
-Let's simulate an observation of the sky, but without any astronomical sources!
-In other words, one where we only have a background flux level (from the sky
emission).
-With the first command below, let's make an image called @file{1.fits} that
contains @mymath{200\times200} pixels that are filled with random noise from a
Poisson distribution with a mean of 100 counts (the flux from the background
sky).
-Recall that the Poisson distribution closely approximates a normal
distribution for large mean values (as in this case).
+The main output of Segment consists of two label datasets (with integer
types, separating the dataset's elements into different classes).
+They have HDU/extension names of @code{CLUMPS} and @code{OBJECTS}.
-The standard deviation (@mymath{\sigma}) of the Poisson distribution is the
square root of the mean, see @ref{Photon counting noise}.
-With the second command, we'll have a look at the image.
-Note that due to the random nature of the noise, the values reported in the
next steps on your computer will be very slightly different.
-To reproduce exactly the same values in different runs, see @ref{Generating
random numbers}, and for more on the first command, see @ref{Arithmetic}.
+Similar to all of Gnuastro's FITS outputs, the zero-th extension/HDU of the
main output file only contains header keywords, and no image or table.
+It contains the Segment input files and parameters (option names and values)
as FITS keywords.
+Note that if an option name is longer than 8 characters, the keyword name is
the second word.
+The first word is @code{HIERARCH}.
+Also note that according to the FITS standard, the keyword names must be in
capital letters, therefore, if you want to use Grep to inspect these keywords,
use the @option{-i} option, like the example below.
@example
-$ astarithmetic 200 200 2 makenew 100 mknoise-poisson \
- --output=1.fits
-
-$ astscript-fits-view 1.fits
+$ astfits image_segmented.fits -h0 | grep -i snquant
@end example
-Each pixel shows the result of one sampling from the Poisson distribution.
-In other words, assuming the sky emission in our simulation is constant over
our field of view, each pixel's value shows one measurement of the sky emission.
-Statistically speaking, a ``measurement'' is a sampling from an underlying
distribution of values.
-Through our measurements, we aim to identify that underlying distribution
(the ``truth'')!
-With the command below, let's look at the pixel statistics of @file{1.fits}
(output is shown immediately under it).
+@cindex DS9
+@cindex SAO DS9
+By default, besides the @code{CLUMPS} and @code{OBJECTS} extensions, Segment's
output will also contain the (technically redundant) input dataset and the sky
standard deviation dataset (if it was not a constant number).
+This can help in visually inspecting the result when viewing the images as a
``Multi-extension data cube'' in SAO DS9, for example (see @ref{Viewing FITS
file contents with DS9 or TOPCAT}).
+You can simply flip through the extensions and see the same region of the
image and its corresponding clumps/object labels.
+It also makes it easy to feed the output (as one file) into MakeCatalog when
you intend to make a catalog afterwards (see @ref{MakeCatalog}).
+To remove these redundant extensions from the output (for example, when
designing a pipeline), you can use @option{--rawoutput}.
-@c If you change this output, replace the standard deviation (10.09) below
-@c in the text.
-@example
-$ aststatistics 1.fits
-Statistics (GNU Astronomy Utilities) @value{VERSION}
--------
-Input: 1.fits (hdu: 1)
--------
- Number of elements: 40000
- Minimum: -4.72824245470431e+01
- Maximum: 4.24861780263050e+01
- Mode: 0.09274776246
- Mode quantile: 0.5004125103
- Median: 8.36190404450713e-02
- Mean: 0.098637593
- Standard deviation: 10.09065298
--------
-Histogram:
- | * ****
- | *********
- | ************
- | **************
- | *****************
- | ********************
- | ***********************
- | **************************
- | ******************************
- | **************************************
- |* * *********************************************************** * *
- |----------------------------------------------------------------------
-@end example
+The @code{OBJECTS} and @code{CLUMPS} extensions can be used as input into
@ref{MakeCatalog} to generate a catalog for higher-level analysis.
+If you want to treat each clump separately, you can give a very large value
(or even a NaN, which will always fail) to the @option{--gthresh} option (for
example, @code{--gthresh=1e10} or @code{--gthresh=nan}), see @ref{Segmentation
options}.
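+For example, a sketch of such a run, followed by a clumps-only catalog with
MakeCatalog's @option{--clumpscat} option (the file names and zero point here
are hypothetical):
+
+@example
+$ astsegment detections.fits --gthresh=nan --output=seg.fits
+$ astmkcatalog seg.fits --clumpscat --ids --ra --dec \
+               --magnitude --zeropoint=22.5
+@end example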
-As expected, you see that the ASCII histogram nicely resembles a normal
distribution.
-The measured mean and standard deviation (@mymath{\sigma_x}) are also very
similar to the input (mean of 100, standard deviation of @mymath{\sigma=10}).
-But the measured mean (and standard deviation) aren't exactly equal to the
input!
+For a complete definition of clumps and objects, please see Section 3.2 of
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]} and
@ref{Segmentation options}.
+The clumps are ``true'' local maxima (minima if @option{--minima} is called)
and their surrounding pixels until a local minimum/maximum (caused by noise
fluctuations, or another ``true'' clump).
+Therefore it may happen that some of the input detections are not covered by
clumps at all (very diffuse objects without any strong peak), while some
objects may contain many clumps.
+Even in those that have clumps, there will be regions that are too diffuse.
+The diffuse regions (within the input detected regions) are given a negative
label (-1) to help you separate them from the undetected regions (with a value
of zero).
-Every time we make a different simulated image from the same distribution,
the measured mean and standard deviation will slightly differ.
-With the second command below, let's build 500 images like above and measure
their mean and standard deviation.
-The outputs will be written into a file (@file{mean-stds.txt}; in the first
command we are deleting it to make sure we write into an empty file within the
loop).
-With the third command, let's view the top 10 rows:
+Each clump is labeled with respect to its host object.
+Therefore, if an object has three clumps for example, the clumps within it
have labels 1, 2 and 3.
+As a result, if an initial detected region has multiple objects, each with a
single clump, all the clumps will have a label of 1.
+The total number of clumps in the dataset is stored in the @code{NCLUMPS}
keyword of the @code{CLUMPS} extension and printed in the verbose output of
Segment (when @option{--quiet} is not called).
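+For example, to read it on the command-line (a small sketch; the file name is
hypothetical, and @option{--keyvalue} prints the value of the requested
keyword):
+
+@example
+$ astfits seg.fits -hCLUMPS --keyvalue=NCLUMPS --quiet
+@end example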
-@example
-$ rm -f mean-stds.txt
-$ for i in $(seq 500); do \
- astarithmetic 200 200 2 makenew 100 mknoise-poisson \
- --output=$i.fits --quiet; \
- aststatistics $i.fits --mean --std >> mean-stds.txt; \
- echo "$i: complete"; \
- done
+The @code{OBJECTS} extension of the output will give a positive counter/label
to every detected pixel in the input.
+As described in Akhlaghi and Ichikawa [2015], the true clumps are grown until
a certain threshold.
+If the grown clumps touch other clumps and the connection is strong enough,
they are considered part of the same @emph{object}.
+Once objects (grown clumps) are identified, they are grown to cover the whole
detected area.
-$ asttable mean-stds.txt -Y --head=10
-99.989381 9.936407
-100.036622 10.059997
-100.006054 9.985470
-99.944535 9.960069
-100.050318 9.970116
-100.002718 9.905395
-100.067555 9.964038
-100.027167 10.018562
-100.051951 9.995859
-100.000212 9.970293
-@end example
+The options to configure the output of Segment are listed below:
-From this table, you see that each simulation has produced a slightly
different measured mean and measured standard deviation (@mymath{\sigma_x})
that are just fluctuating around the input mean (which was 100) and input
standard deviation (@mymath{\sigma=10}).
-Let's have a look at the distribution of mean measurements:
+@table @option
+@item --continueaftercheck
+Do not abort Segment after producing the check image(s).
+The usage of this option is identical to NoiseChisel's
@option{--continueaftercheck} option (@ref{NoiseChisel input}).
+Please see the descriptions there for more.
-@example
-$ aststatistics mean-stds.txt -c1
-Statistics (GNU Astronomy Utilities) @value{VERSION}
--------
-Input: mean-stds.txt
-Column: 1
--------
- Number of elements: 500
- Minimum: 9.98183528700191e+01
- Maximum: 1.00146490891332e+02
- Mode: 99.99709739
- Mode quantile: 0.49498998
- Median: 9.99977393190436e+01
- Mean: 99.99891826
- Standard deviation: 0.04901635275
--------
-Histogram:
- | *
- | * **
- | ****** **** * *
- | ****** **** * * *
- | * * ************* * *
- | * ****************** **
- | * ********************* *** *
- | * ***************************** ***
- | *** ********************************** *
- | *** ******************************************* **
- | * ************************************************* ** *
- |----------------------------------------------------------------------
-@end example
+@item --noobjects
+Abort Segment after finding true clumps and do not continue with finding
objects.
+Therefore, no @code{OBJECTS} extension will be present in the output.
+Each true clump in @code{CLUMPS} will get a unique label, but diffuse regions
will still have a negative value.
-@cindex Standard error of mean
-The standard deviation of the various mean measurements above shows the
scatter in measuring the mean with an image of this size from this underlying
distribution.
-This is therefore defined as the @emph{standard error of the mean}, or
``error'' for short (since most measurements are actually the mean of a
population) and shown with @mymath{\widehat\sigma_{\bar{x}}}.
+To make a catalog of the clumps, this output can be fed into
@ref{MakeCatalog} with @option{--clumpscat}, along with the input detection
map to Segment (which only had a value of @code{1} for all detected pixels).
+In this way, MakeCatalog will assume all the clumps belong to a single
``object''.
-From the example above, you see that the error is smaller than the standard
deviation (smaller when you have a larger sample).
-In fact, @url{https://en.wikipedia.org/wiki/Standard_error#Derivation, it can
be shown} that this ``error of the mean'' (@mymath{\sigma_{\bar{x}}}) is
related to the distribution's standard deviation (@mymath{\sigma}) through the
equation below, where @mymath{N} is the number of points used to measure the
mean in one sample (@mymath{200\times200=40000} in this case).
-Note that the @mymath{10.09} below was reported as the ``standard deviation''
in the first run of @code{aststatistics} on @file{1.fits} above:
+@item --grownclumps
+In the output @code{CLUMPS} extension, store the grown clumps.
+If a detected region contains no clumps or only one clump, then it will be
fully given a label of @code{1} (no negative valued pixels).
-@c The 10.09 depends on the 'aststatistics 1.fits' command above.
-@dispmath{\sigma_{\bar{x}}=\frac{\sigma}{\sqrt{N}} \quad\quad {\rm or}
\quad\quad \widehat\sigma_{\bar{x}}\approx\frac{\sigma_x}{\sqrt{N}} =
\frac{10.09}{200} = 0.05}
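-You can verify this relation on the table built above: the first command
below measures the scatter of the 500 measured means, and the second estimates
it from the mean of the measured standard deviations divided by
@mymath{\sqrt{40000}=200} (the two numbers should be very close; AWK is used
for the final division):
-
-@example
-$ aststatistics mean-stds.txt -c1 --std
-$ aststatistics mean-stds.txt -c2 --mean | awk '@{print $1/200@}'
-@end example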
+@item --rawoutput
+Only write the @code{CLUMPS} and @code{OBJECTS} datasets in the output file.
+Without this option (by default), the first and last extensions of the output
will be the Sky-subtracted input dataset and the Sky standard deviation
dataset (if it was not a constant number).
+When the datasets are small, these redundant extensions can make it convenient
to inspect the results visually or feed the output to @ref{MakeCatalog} for
measurements.
+Ultimately both the input and Sky standard deviation datasets are redundant
(you had them before running Segment).
+When the inputs are large/numerous, these extra datasets can be a burden.
+@end table
+@cartouche
@noindent
-Taking the considerations above into account, we should clearly distinguish
the following concepts when talking about the standard deviation or error:
+@cindex Compression
+@strong{Save space:} with @option{--rawoutput}, Segment's output will only
be two labeled datasets (only containing integers).
+Since they have no noise, such datasets can be compressed very effectively
(without any loss of data) with exceptionally high compression ratios.
+You can use the following command to compress it with the best ratio:
-@table @asis
-@item Standard deviation of population
-This is the standard deviation of the underlying distribution (10 in the
example above), and shown by @mymath{\sigma}.
-This is something you can never measure, and is just the ideal value.
+@cindex GNU Gzip
+@example
+$ gzip --best segment_output.fits
+@end example
-@item Standard deviation of mean
-Ideal error of measuring the mean (assuming we know @mymath{\sigma}).
+@noindent
+The resulting @file{.fits.gz} file can then be fed into any of Gnuastro's
programs directly, without having to decompress it separately (it will just
take them a little longer, because they have to decompress it internally before
use).
+@end cartouche
-@item Standard deviation of sample (i.e., @emph{Standard deviation})
-The standard deviation measured from a sampling of the ideal distribution.
-This is the second column of @file{mean-stds.txt} above and is shown with
@mymath{\sigma_x}.
-In astronomical literature, this is simply referred to as the ``standard
deviation''.
+When the input is a 2D image, to inspect Segment's output you can configure
SAO DS9 in your Graphical User Interface (GUI) to open it as a
multi-extension data cube.
+This will allow you to flip through the different extensions and visually
inspect the results.
+This process has been described for the GNOME GUI (most common GUI in
GNU/Linux operating systems) in @ref{Viewing FITS file contents with DS9 or
TOPCAT}.
-In other words, the standard deviation is computed on the input itself and
MakeCatalog just needs a ``values'' file.
-For example, when measuring the standard deviation of an astronomical object
using MakeCatalog, it is computed directly from the input values.
-@item Standard error (i.e., @emph{error})
-Measurable scatter of measuring the mean (@mymath{\widehat\sigma_{\bar{x}}})
that can be estimated from the size of the sample and the measured standard
deviation (@mymath{\sigma_x}).
-In astronomical literature, this is simply referred to as the ``error''.
-In other words, when asking for an ``error'' measurement with MakeCatalog, a
separate standard deviation dataset should always be provided.
-This dataset should take into account all sources of scatter.
-For example, during the reduction of an image, the standard deviation dataset
should take into account the dispersion of each pixel that comes from the
bias, dark, flat fielding, etc.
-If this image is not available, it is possible to use the @code{SKY_STD}
extension from NoiseChisel as an estimation.
-For more see @ref{NoiseChisel output}.
-@end table
-@node Magnitude measurement error of each detection, Surface brightness error
of each detection, Standard deviation vs error, Quantifying measurement limits
-@subsubsection Magnitude measurement error of each detection
-The raw error in measuring the magnitude is only meaningful when the object's
magnitude is brighter than the upper-limit magnitude (see below).
-As discussed in @ref{Brightness flux magnitude}, the magnitude (@mymath{M}) of
an object with brightness @mymath{B} and zero point magnitude @mymath{z} can be
written as:
-@dispmath{M=-2.5\log_{10}(B)+z}
-@noindent
-Calculating the derivative with respect to @mymath{B}, we get:
-@dispmath{{dM\over dB} = {-2.5\over {B\times ln(10)}}}
-@noindent
-From the Taylor series (@mymath{\Delta{M}=dM/dB\times\Delta{B}}), we can
write:
+@node MakeCatalog, Match, Segment, Data analysis
+@section MakeCatalog
+
+At the lowest level, a dataset (for example, an image) is just a collection of
values, placed after each other in any number of dimensions (for example, an
image is a 2D dataset).
+Each data-element (pixel) just has two properties: its position (relative to
the rest) and its value.
+In higher-level analysis, an entire dataset (an image for example) is rarely
treated as a singular entity@footnote{You can derive the over-all properties of
a complete dataset (1D table column, 2D image, or 3D data-cube) treated as a
single entity with Gnuastro's Statistics program (see @ref{Statistics}).}.
+You usually want to know/measure the properties of the (separate)
scientifically interesting targets that are embedded in it.
+For example, the magnitudes, positions and elliptical properties of the
galaxies that are in the image.
+
+MakeCatalog is Gnuastro's program for localized measurements over a dataset.
+In other words, MakeCatalog is Gnuastro's program to convert low-level
datasets (like images), to high level catalogs.
+The role of MakeCatalog in a scientific analysis and the benefits of its model
(where detection/segmentation is separated from measurement) is discussed in
@url{https://arxiv.org/abs/1611.06387v1, Akhlaghi [2016]}@footnote{A published
paper cannot undergo any more change, so this manual is the definitive guide.}
and summarized in @ref{Detection and catalog production}.
+We strongly recommend reading this short paper for a better understanding of
this methodology.
+Understanding the effective usage of MakeCatalog will thus also help in the
effective use of Gnuastro's other (lower-level) programs like @ref{NoiseChisel}
or @ref{Segment}.
-@noindent
-But, @mymath{\Delta{B}/B} is just the inverse of the Signal-to-noise ratio
(@mymath{S/N}), so we can write the error in magnitude in terms of the
signal-to-noise ratio:
+It is important to define your regions of interest for measurements
@emph{before} running MakeCatalog.
+MakeCatalog is specialized in doing measurements accurately and efficiently.
+Therefore MakeCatalog will not do detection, segmentation, or defining
apertures on requested positions in your dataset.
+Following Gnuastro's modularity principle, there are separate and highly
specialized and customizable programs in Gnuastro for these other jobs as shown
below (for a usage example in a real-world analysis, see @ref{General program
usage tutorial} and @ref{Detecting large extended targets}).
-@dispmath{ \Delta{M} = {2.5\over{S/N\times ln(10)}} }
+@itemize
+@item
+@ref{Arithmetic}: Detection with a simple threshold.
-MakeCatalog uses this relation to estimate the magnitude errors.
-The signal-to-noise ratio is calculated in different ways for clumps and
objects (see @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
[2015]}), but this single equation can be used to estimate the measured
magnitude error afterwards for any type of target.
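-For example, a measurement with @mymath{S/N=5} has a magnitude error of
@mymath{2.5/(5\ln 10)\approx0.22} magnitudes.
-As a quick scalar sketch with Arithmetic (where @code{log} is the natural
logarithm):
-
-@example
-$ astarithmetic 2.5 5 10 log x / --quiet
-@end example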
+@item
+@ref{NoiseChisel}: Advanced detection.
-@node Surface brightness error of each detection, Completeness limit of each
detection, Magnitude measurement error of each detection, Quantifying
measurement limits
-@subsubsection Surface brightness error of each detection
+@item
+@ref{Segment}: Segmentation (substructure over detections).
-@cindex Surface brightness error
-@cindex Error in surface brightness
-We can derive the error in measuring the surface brightness based on the
surface brightness (SB) equation of @ref{Brightness flux magnitude} and the
generic magnitude error (@mymath{\Delta{M}}) of @ref{Magnitude measurement
error of each detection}.
-Let's set @mymath{A} to represent the area and @mymath{\Delta{A}} to represent
the error in measuring the area.
-For more on @mymath{\Delta{A}}, see the description of
@option{--spatialresolution} in @ref{MakeCatalog inputs and basic settings}.
+@item
+@ref{MakeProfiles}: Aperture creation for known positions.
+@end itemize
-@dispmath{\Delta{(SB)} = \Delta{M} + \left|{-2.5\over
ln(10)}\right|\times{\Delta{A}\over{A}}}
+These programs will/can return labeled dataset(s) to be fed into MakeCatalog.
+A labeled dataset for measurement has the same size/dimensions as the input,
but with integer valued pixels that have the label/counter for each sub-set of
pixels that must be measured together.
+For example, all the pixels covering one galaxy in an image, get the same
label.
-In the surface brightness equation mentioned above, @mymath{A} is in units of
arcsecond squared and the conversion between arcseconds to pixels is a
multiplication factor.
-Therefore as long as @mymath{A} and @mymath{\Delta{A}} have the same units, it
does not matter if they are in arcseconds or pixels.
-Since the measure of spatial resolution (or area error) is the FWHM of the
PSF, which is usually defined in terms of pixels, it is more intuitive to use
pixels for @mymath{A} and @mymath{\Delta{A}}.
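-For example (a small sketch with made-up numbers), for @mymath{\Delta{M}=0.2}
and @mymath{\Delta{A}/A=0.05}, the surface brightness error is
@mymath{0.2+(2.5/\ln 10)\times0.05\approx0.25}:
-
-@example
-$ astarithmetic 0.2 2.5 10 log / 0.05 x + --quiet
-@end example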
+The requested measurements are then done on similarly labeled pixels.
+The final result is a catalog where each row corresponds to the measurements
on pixels with a specific label.
+For example, the flux weighted average position of all the pixels with a label
of 42 will be written into the 42nd row of the output catalog/table's central
position column@footnote{See @ref{Measuring elliptical parameters} for a
discussion on this and the derivation of positional parameters, which includes
the center.}.
+Similarly, the sum of all these pixels will be the 42nd row in the sum column,
etc.
+Pixels with labels equal to, or smaller than, zero will be ignored by
MakeCatalog.
+In other words, the number of rows in MakeCatalog's output is already known
before running it (the maximum value of the labeled dataset).
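+For example, a minimal sketch of feeding Segment's output into MakeCatalog
(the file name and zero point here are hypothetical; see @ref{Invoking
astmkcatalog} for the full set of options and columns):
+
+@example
+$ astmkcatalog seg.fits --ids --x --y --magnitude --sn \
+               --zeropoint=22.5
+@end example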
-@node Completeness limit of each detection, Upper limit magnitude of each
detection, Surface brightness error of each detection, Quantifying measurement
limits
-@subsubsection Completeness limit of each detection
-@cindex Completeness
-As the surface brightness of the objects decreases, the ability to detect them
will also decrease.
-An important statistic is thus the fraction of objects of similar morphology
and magnitude that will be detected with our detection algorithm/parameters in
a given image.
-This fraction is known as @emph{completeness}.
-For brighter objects, completeness is 1: all bright objects that might exist
over the image will be detected.
-However, as we go to objects of lower overall surface brightness, we will
fail to detect a fraction of them, and fainter than a certain surface
brightness level (for each morphology), nothing will be detectable in the
image: you will need more data to construct a ``deeper'' image.
-For a given profile and dataset, the magnitude where the completeness drops
below a certain level (usually above @mymath{90\%}) is known as the
completeness limit.
+Before getting into the details of running MakeCatalog (in @ref{Invoking
astmkcatalog}), we will start with a discussion on the basics of its approach
to separating detection from measurements in @ref{Detection and catalog
production}.
+A very important factor in any measurement is understanding its validity
range, or limits.
+Therefore in @ref{Quantifying measurement limits}, we will discuss how to
estimate the reliability of the detection and basic measurements.
+This section will continue with a derivation of elliptical parameters from the
labeled datasets in @ref{Measuring elliptical parameters}.
+For those who feel MakeCatalog's existing measurements/columns are not enough
and would like to add further measurements, in @ref{Adding new columns to
MakeCatalog}, a checklist of steps is provided for readily adding your own new
measurements/columns.
-@cindex Purity
-@cindex False detections
-@cindex Detections false
-Another important parameter in measuring completeness is purity: the fraction
of true detections to all detections.
-In effect purity is the measure of contamination by false detections: the
higher the purity, the lower the contamination.
-Completeness and purity are anti-correlated: if we can allow a large number of
false detections (that we might be able to remove by other means), we can
significantly increase the completeness limit.
+@menu
+* Detection and catalog production:: Discussing why/how to treat these
separately
+* Brightness flux magnitude:: More on Magnitudes, surface brightness, etc.
+* Quantifying measurement limits:: For comparing different catalogs.
+* Measuring elliptical parameters:: Estimating elliptical parameters.
+* Adding new columns to MakeCatalog:: How to add new columns.
+* MakeCatalog measurements:: List of all the measurements/columns by
MakeCatalog.
+* Invoking astmkcatalog:: Options and arguments to MakeCatalog.
+@end menu
-One traditional way to measure the completeness and purity of a given sample
is by embedding mock profiles in regions of the image with no detection.
-However, in such a study we must be very careful to choose model profiles
that are as similar to the target of interest as possible.
+@node Detection and catalog production, Brightness flux magnitude,
MakeCatalog, MakeCatalog
+@subsection Detection and catalog production
+Most existing common tools in low-level astronomical data-analysis (for
example,
SExtractor@footnote{@url{https://www.astromatic.net/software/sextractor}})
merge the two processes of detection and measurement (catalog production) in
one program.
+However, in light of Gnuastro's modularized approach (modeled on the Unix
system), detection is separated from measurements and catalog production.
+This modularity is therefore new to many experienced astronomers and deserves
a short review here.
+Further discussion on the benefits of this methodology can be seen in
@url{https://arxiv.org/abs/1611.06387v1, Akhlaghi [2016]}.
+As discussed in the introduction of @ref{MakeCatalog}, detection (identifying
which pixels to do measurements on) can be done with different programs.
+Their outputs (a labeled dataset) can be directly fed into MakeCatalog to do
the measurements and write the result as a catalog/table.
+Beyond that, Gnuastro's modular approach has many benefits that will become
clear as you get more experienced in astronomical data analysis and want to be
more creative in using your valuable data for the exciting scientific project
you are working on.
+In short, the reasons for this modularity can be classified as below:
-@node Upper limit magnitude of each detection, Magnitude limit of image,
Completeness limit of each detection, Quantifying measurement limits
-@subsubsection Upper limit magnitude of each detection
-Due to the noisy nature of data, it is possible to get arbitrarily faint
magnitudes, especially when you use labels from another image (for example see
@ref{Working with catalogs estimating colors}).
-Given the scatter caused by the dataset's noise, values fainter than a certain
level are meaningless: another similar depth observation will give a radically
different value.
-In such cases, measurements like the image magnitude limit are not useful
because it is estimated for a certain morphology and is given for the whole
image (it is a crude generalization; see @ref{Magnitude limit of image}).
-You want a quality measure that is specific to each object.
+@itemize
-For example, assume that you have done your detection and segmentation on one
filter and now you do measurements over the same labeled regions, but on other
filters to measure colors (as we did in the tutorial @ref{Segmentation and
making a catalog}).
-Some objects are not going to have any significant signal in the other
filters, but for example, you measure a magnitude of 36 for one of them!
-This is clearly unreliable (no dataset in current astronomy is able to detect
such a faint signal).
-In another image with the same depth, using the same filter, you might measure
a magnitude of 30 for it, and yet another might give you 33.
-Furthermore, the total sum of pixel values might actually be negative in some
images of the same depth (due to noise).
-In these cases, no magnitude can be defined and MakeCatalog will place a NaN
there (recall that a magnitude is a base-10 logarithm).
+@item
+Simplicity/robustness of independent, modular tools: making a catalog is a
logically separate process from labeling (detection, segmentation, or aperture
production).
+A user might want to do certain operations on the labeled regions before
creating a catalog for them.
+Another user might want the properties of the same pixels/objects in another
image (another filter for example) to measure the colors or SED fittings.
-@cindex Upper limit magnitude
-@cindex Magnitude, upper limit
-Using such unreliable measurements will directly affect our analysis, so we
must not use the raw measurements.
-When approaching the limits of your detection method, it is therefore
important to be able to identify such cases.
-But how can we know how reliable a measurement of one object on a given
dataset is?
+Here is an example of doing both: suppose you have images in various broad
band filters at various resolutions and orientations.
+The image of one color will thus not lie exactly on another or even be in the
same scale.
+However, it is imperative that the same pixels be used in measuring the colors
of galaxies.
-When we confront such unreasonably faint magnitudes, there is one thing we
can deduce: that if something actually exists under our labeled pixels
(possibly buried deep under the noise), its inherent magnitude is fainter than
an @emph{upper limit magnitude}.
-To find this upper limit magnitude, we place the object's footprint
(segmentation map) over a random part of the image where there are no
detections, and measure the sum of pixel values within the footprint.
-Doing this a large number of times will give us a distribution of measurements
of the sum.
-The standard deviation (@mymath{\sigma_m}) of that distribution can be used
to quantify the upper limit magnitude for that particular object (given its
particular shape and area), for a given multiple @mymath{n}:
+To solve the problem, NoiseChisel can be run on the reference image to
generate the labeled detection image.
+Afterwards, the labeled image can be warped into the grid of the other color
(using @ref{Warp}).
+MakeCatalog will then generate the same catalog for both colors (with the
different labeled images).
+It is currently customary to warp the images to the same pixel grid; however,
modification of the scientific dataset is very harmful for the data and
creates correlated noise.
+It is much more accurate to do the transformations on the labeled image.
-@dispmath{M_{up,n\sigma}=-2.5\times\log_{10}{(n\sigma_m)}+z \quad\quad
[mag/target]}
+@item
+Complexity of a monolith: Adding in a catalog functionality to the detector
program will add several more steps (and many more options) to its processing
that can equally well be done outside of it.
+This makes following what the program does harder for the users and
developers, and it can also potentially add many bugs.
-@cindex Correlated noise
-Traditionally, faint/small object photometry was done using fixed circular
apertures (for example, with a diameter of @mymath{N} arc-seconds) and there
was not much processing involved (to make a deep stack).
-Hence, the upper limit was synonymous with the surface brightness limit
discussed above: one value for the whole image.
-The problem with this simplified approach is that the number of pixels in the
aperture directly affects the final distribution and thus magnitude.
-Also, the image's correlated noise might actually create certain patterns,
so the shape of the object can also affect the final result.
-Fortunately, with the much more advanced hardware and software of today, we
can make customized segmentation maps (footprint) for each object and have
enough computing power to actually place that footprint over many random places.
-As a result, the per-target upper-limit magnitude and general surface
brightness limit have diverged.
-As an example, if the parameter you want to measure over one profile is not
provided by the developers of MakeCatalog, you can simply open this tiny
program and easily add your desired calculation.
+This process is discussed in @ref{Adding new columns to MakeCatalog}.
+However, if making a catalog was part of NoiseChisel for example, adding a new
column/measurement would require a lot of energy to understand all the steps
and internal structures of that huge program.
+It might even be so intertwined with its processing, that adding new columns
might cause problems/bugs in its primary job (detection).
-When any of the upper-limit-related columns are requested, MakeCatalog will
randomly place each target's footprint over the undetected parts of the
dataset as described above, and estimate the required properties.
-The procedure is fully configurable with the options in @ref{Upper-limit
settings}.
-You can get the full list of upper-limit related columns of MakeCatalog with
this command (the extra @code{--} before @code{--upperlimit} is
necessary@footnote{Without the extra @code{--}, grep will assume that
@option{--upperlimit} is one of its own options, and will thus abort,
complaining that it has no option with this name.}):
+@end itemize
-@example
-$ astmkcatalog --help | grep -- --upperlimit
-@end example
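-As a rough sketch, requesting the upper limit magnitude with @mymath{n=3}
over 1000 random placements could look like this (the input name, zero point
and option values here are only assumptions):
-
-@example
-$ astmkcatalog seg.fits --ids --magnitude --upperlimitmag \
-               --upnsigma=3 --upnum=1000 --zeropoint=22.5
-@end example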
-@node Magnitude limit of image, Surface brightness limit of image, Upper limit
magnitude of each detection, Quantifying measurement limits
-@subsubsection Magnitude limit of image
-@cindex Magnitude limit
-Suppose we have taken two images of the same field of view with the same CCD,
once with a smaller telescope, and once with a larger one.
-Because we used the same CCD, the noise will be very similar.
-However, the larger telescope has gathered more light, therefore the same star
or galaxy will have a higher signal-to-noise ratio (S/N) in the image taken
with the larger one.
-The same applies for a stacked image of the field compared to a
single-exposure image of the same telescope.
-This concept is used by some researchers to define the ``magnitude limit'' or
``detection limit'' at a certain S/N (sometimes 10, 5 or 3 for example, also
written as @mymath{10\sigma}, @mymath{5\sigma} or @mymath{3\sigma}).
-To do this, they measure the magnitude and signal-to-noise ratio of all the
objects within an image and measure the mean (or median) magnitude of objects
at the desired S/N.
-A fully working example of deriving the magnitude limit is available in the
tutorials section: @ref{Measuring the dataset limits}.
-However, this method should be used with extreme care!
-This is because the shape of the object becomes important in this method: a
sharper object will have a higher @emph{measured} S/N compared to a more
diffuse object at the same original magnitude.
-Besides the inherent shape/sharpness of the object, issues like the PSF also
become important in this method (because the finally observed shapes of objects
are important here): two surveys with the same surface brightness limit (see
@ref{Surface brightness limit of image}) will have different magnitude limits
if one is taken from space and the other from the ground.
-@node Surface brightness limit of image, Upper limit magnitude of image,
Magnitude limit of image, Quantifying measurement limits
-@subsubsection Surface brightness limit of image
-@cindex Surface brightness
-As we make more observations on one region of the sky and add/combine the
observations into one dataset, both the signal and the noise increase.
-However, the signal increases much faster than the noise: assuming you add
@mymath{N} datasets with equal exposure times, the signal will increase as a
multiple of @mymath{N}, while the noise increases as @mymath{\sqrt{N}}.
-Therefore the signal-to-noise ratio increases by a factor of @mymath{\sqrt{N}}.
-Visually, fainter (per pixel) parts of the objects/signal in the image will
become more visible/detectable.
-The noise-level is known as the dataset's surface brightness limit.
-You can think of the noise as muddy water that is completely covering a flat
ground@footnote{The ground is the sky value in this analogy, see @ref{Sky
value}.
-Note that this analogy only holds for a flat sky value across the surface of
the image or ground.}.
-The signal (coming from astronomical objects in real data) will be
summits/hills that start from the flat sky level (under the muddy water) and
their summits can sometimes reach above the muddy water.
-Let's assume that in your first observation the muddy water has just been
stirred and except a few small peaks, you cannot see anything through the mud.
-As you wait and make more observations/exposures, the mud settles down and the
@emph{depth} of the transparent water increases.
-As a result, more and more summits become visible and the lower parts of the
hills (parts with lower surface brightness) can be seen more clearly.
-In this analogy@footnote{Note that this muddy water analogy is not perfect,
because while the water-level remains the same all over a peak, in data
analysis, the Poisson noise increases with the level of data.}, height (from
the ground) is the @emph{surface brightness} and the height of the muddy water
at the moment you combine your data, is your @emph{surface brightness limit}
for that moment.
-@cindex Data's depth
-The outputs of NoiseChisel include the Sky standard deviation
(@mymath{\sigma}) on every group of pixels (a tile) that were calculated from
the undetected pixels in each tile, see @ref{Tessellation} and @ref{NoiseChisel
output}.
-Let's take @mymath{\sigma_m} as the median @mymath{\sigma} over the successful
meshes in the image (prior to interpolation or smoothing).
-It is recorded in the @code{MEDSTD} keyword of the @code{SKY_STD} extension of
NoiseChisel's output.
-@cindex ACS camera
-@cindex Surface brightness limit
-@cindex Limit, surface brightness
-On different instruments, pixels cover different spatial angles over the sky.
-For example, the width of each pixel on the ACS camera on the Hubble Space
Telescope (HST) is roughly 0.05 seconds of arc, while the pixels of SDSS are
each 0.396 seconds of arc (almost eight times wider@footnote{Ground-based
instruments like the SDSS suffer from strong smoothing due to the atmosphere.
-Therefore, increasing the pixel resolution (or decreasing the width of a
pixel) will not increase the received information.}).
-Nevertheless, irrespective of its sky coverage, a pixel is our unit of data
collection.
-To start with, we define the low-level Surface brightness limit or
@emph{depth}, in units of magnitude/pixel with the equation below (assuming the
image has zero point magnitude @mymath{z} and we want the @mymath{n}th multiple
of @mymath{\sigma_m}).
+@node Brightness flux magnitude, Quantifying measurement limits, Detection and
catalog production, MakeCatalog
+@subsection Brightness, Flux, Magnitude and Surface brightness
-@dispmath{SB_{n\sigma,\rm pixel}=-2.5\times\log_{10}{(n\sigma_m)}+z \quad\quad
[mag/pixel]}
+@cindex ADU
+@cindex Gain
+@cindex Counts
+Astronomical data pixels are usually in units of counts@footnote{Counts are
also known as analog to digital units (ADU).} or electrons or either one
divided by seconds.
+To convert from the counts to electrons, you will need to know the instrument
gain.
+In any case, they can be directly converted to energy or energy/time using
the basic hardware (telescope, camera and filter) information (that is
summarized in the @emph{zero point}, which we will discuss below).
+We will continue the discussion assuming the pixels are in units of
energy/time.
-@cindex XDF survey
-@cindex CANDELS survey
-@cindex eXtreme Deep Field (XDF) survey
-As an example, the XDF survey covers part of the sky that the HST has observed
the most (for 85 orbits) and is consequently very small (@mymath{\sim4} minutes
of arc, squared).
-On the other hand, the CANDELS survey, is one of the widest multi-color
surveys done by the HST covering several fields (about 720 arcmin@mymath{^2})
but its deepest fields have only 9 orbits observation.
-The @mymath{1\sigma} depth of the XDF and CANDELS-deep surveys in the near
infrared WFC3/F160W filter are respectively 34.40 and 32.45 magnitudes/pixel.
-In a single orbit image, this same field has a @mymath{1\sigma} depth of 31.32
magnitudes/pixel.
-Recall that a larger magnitude corresponds to fainter objects, see
@ref{Brightness flux magnitude}.
+@table @asis
+@cindex Flux
+@cindex Luminosity
+@cindex Brightness
+@item Brightness
+The @emph{brightness} of an object is defined as its measured energy in units
of time.
+If our detector pixels directly measured the energy from the astronomical
object@footnote{In practice, the measured pixels don't just count the
astronomical object's energy: imaging detectors insert a certain bias level
before the exposure, they amplify the photo-electrons, there are optical
artifacts like flat-fielding, and finally, there is the background light.},
then the brightness would be the total sum of pixel values (energy) associated
to the object, divided by the exposure time.
+The @emph{flux} of an object is defined in units of
energy/time/collecting-area.
+For an astronomical target, the flux is therefore defined as its brightness
divided by the area used to collect the light from the source; or the telescope
aperture (for example, in units of @mymath{cm^2}).
+Knowing the flux (@mymath{f}) and distance to the object (@mymath{r}), we can
define its @emph{luminosity}: @mymath{L=4{\pi}r^2f}.
-@cindex Pixel scale
-The low-level magnitude/pixel measurement above is only useful when all the
datasets you want to use, or compare, have the same pixel size.
-However, you will often find yourself using, or comparing, datasets from
various instruments with different pixel scales (projected pixel width, in
arc-seconds).
-If we know the pixel scale, we can obtain a more easily comparable surface
brightness limit in units of: magnitude/arcsec@mymath{^2}.
-But another complication is that astronomical objects are usually larger than
1 arcsec@mymath{^2}.
-As a result, it is common to measure the surface brightness limit over a
larger (but fixed, depending on context) area.
+Therefore, while flux and luminosity are intrinsic properties of the object,
brightness depends on our detecting tools (hardware and software).
+In low-level observational astronomy data analysis, we are usually more
concerned with measuring the brightness, because it is what we directly
measure from the image pixels and record in catalogs.
+On the other hand, luminosity is used in higher-level analysis (after image
contents have been measured and written into catalogs to deduce physical
interpretations), because calculating it requires high-level quantities like
the distance/redshift.
+At this stage, it is just important to avoid confusing luminosity with
brightness, because both have the same units of energy per unit time.
-Let's assume that every pixel is @mymath{p} arcsec@mymath{^2} and we want the
surface brightness limit for an object covering A arcsec@mymath{^2} (so
@mymath{A/p} is the number of pixels that cover an area of @mymath{A}
arcsec@mymath{^2}).
-On the other hand, noise is added in RMS@footnote{If you add three datasets
with noise @mymath{\sigma_1}, @mymath{\sigma_2} and @mymath{\sigma_3}, the
resulting noise level is
@mymath{\sigma_t=\sqrt{\sigma_1^2+\sigma_2^2+\sigma_3^2}}, so when
@mymath{\sigma_1=\sigma_2=\sigma_3\equiv\sigma}, then
@mymath{\sigma_t=\sigma\sqrt{3}}.
-In this case, the area @mymath{A} is covered by @mymath{A/p} pixels, so the
noise level is @mymath{\sigma_t=\sigma\sqrt{A/p}}.}, hence the noise level in
@mymath{A} arcsec@mymath{^2} is @mymath{n\sigma_m\sqrt{A/p}}.
-But we want the result in units of arcsec@mymath{^2}, so we should divide this
by @mymath{A} arcsec@mymath{^2}:
-@mymath{n\sigma_m\sqrt{A/p}/A=n\sigma_m\sqrt{A/(pA^2)}=n\sigma_m/\sqrt{pA}}.
-Plugging this into the magnitude equation, we get the @mymath{n\sigma} surface
brightness limit, over an area of A arcsec@mymath{^2}, in units of
magnitudes/arcsec@mymath{^2}:
+@item Magnitude
+@cindex Magnitudes from flux
+@cindex Flux to magnitude conversion
+@cindex Astronomical Magnitude system
+Images of astronomical objects span a very large range of brightness: the
Sun (as the brightest object) is roughly @mymath{2.5^{60}=10^{24}} times
brighter than the faintest galaxies we can currently detect in the deepest
images.
+Discussing brightness directly would therefore involve an inconveniently
large range of values.
+So astronomers have chosen to use a logarithmic scale for the brightness of
astronomical objects.
-@dispmath{SB_{{n\sigma,\rm A
arcsec}^2}=-2.5\times\log_{10}{\left(n\sigma_m\over \sqrt{pA}\right)+z}
\quad\quad [mag/arcsec^2]}
+@cindex Hipparchus of Nicaea
+But the logarithm is only usable with a dimensionless value that is always
positive.
+Fortunately brightness is always positive (at least in theory@footnote{In
practice, for very faint objects, if the background brightness is
over-subtracted, we may end up with a negative ``brightness'' or sum of pixels
in a real object.}).
+To remove the dimensions, we divide the brightness of the object (@mymath{B})
by a reference brightness (@mymath{B_r}).
+We then define a logarithmic scale as @mymath{magnitude} through the relation
below.
+The @mymath{-2.5} factor in the definition of magnitudes is a legacy of our
ancient colleagues, in particular Hipparchus of Nicaea (190-120 BC).
-@cindex World Coordinate System (WCS)
-MakeCatalog will calculate the input dataset's @mymath{SB_{n\sigma,\rm pixel}}
and @mymath{SB_{{n\sigma,\rm A arcsec}^2}} and will write them as the
@code{SBLMAGPIX} and @code{SBLMAG} keywords the output catalog(s), see
@ref{MakeCatalog output}.
-You can set your desired @mymath{n}-th multiple of @mymath{\sigma} and the
@mymath{A} arcsec@mymath{^2} area using the following two options respectively:
@option{--sfmagnsigma} and @option{--sfmagarea} (see @ref{MakeCatalog output}).
-Just note that @mymath{SB_{{n\sigma,\rm A arcsec}^2}} is only calculated if
the input has World Coordinate System (WCS).
-Without WCS, the pixel scale cannot be derived.
+@dispmath{m-m_r=-2.5\log_{10} \left( B \over B_r \right)}
-@cindex Correlated noise
-@cindex Noise, correlated
-As you saw in its derivation, the calculation above extrapolates the noise in
one pixel over all the input's pixels!
-It therefore implicitly assumes that the noise is the same in all of the
pixels.
-But this only happens in individual exposures: reduced data will have
correlated noise because they are a stack of many individual exposures that
have been warped (thus mixing the pixel values).
-A more accurate measure which will provide a realistic value for every labeled
region is known as the @emph{upper-limit magnitude}, which is discussed below.
+@noindent
+@mymath{m} is defined as the magnitude of the object and @mymath{m_r} is the
pre-defined magnitude of the reference brightness.
+For estimating the error in measuring a magnitude, see @ref{Quantifying
measurement limits}.
+@item Zero point
+@cindex Zero point magnitude
+@cindex Magnitude zero point
+A unique situation in the magnitude equation above occurs when the reference
brightness is unity (@mymath{B_r=1}).
+This brightness will thus summarize all the hardware-specific parameters
discussed above (like the conversion of pixel values to physical units) into
one number.
+That reference magnitude is commonly known as the @emph{Zero point} magnitude
because when @mymath{B=B_r=1}, the right side of the magnitude definition above
will be zero.
+Using the zero point magnitude (@mymath{Z}), we can write the magnitude
relation above in a simpler format:
-@node Upper limit magnitude of image, , Surface brightness limit of image,
Quantifying measurement limits
-@subsubsection Upper limit magnitude of image
-As mentioned in @ref{Upper limit magnitude of each detection}, the upper-limit
magnitude will depend on the shape of each object's footprint.
-Therefore we can measure a dataset's upper-limit magnitude using standard
shapes.
+@dispmath{m = -2.5\log_{10}(B) + Z}
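+
+For a quick numerical check of this relation on the command-line, you can use
+AWK (the brightness of 250 counts/sec and zero point of 22.5 below are purely
+hypothetical values for this sketch; it should print a magnitude of about
+16.5):
+
+@example
+$ echo "250 22.5" | awk '@{print -2.5*log($1)/log(10) + $2@}'
+@end example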
-Traditionally a circular aperture of a fixed size (in arcseconds) has been
used.
-For a full example of implementing this, see the respective section in the
tutorial (@ref{Image surface brightness limit}).
+@cindex Janskys (Jy)
+@cindex AB magnitude
+@cindex Magnitude, AB
+Gnuastro has an installed script to estimate the zero point of any image, see
@ref{Zero point estimation} (it contains practical tutorials to help you get
started fast).
+Having the zero point of an image, you can convert its pixel values to
physical units like microJanskys (or @mymath{\mu{}Jy}).
+This enables direct pixel-based comparisons with images from other
instruments@footnote{Comparing data from different instruments assumes
instrument and observation signatures are properly corrected, things like the
flat-field or the Sky absorption.
+Pixel-level comparisons additionally assume that factors that can change the
morphology (like the @ref{PSF}) are the same.}.
+Jansky is a commonly used unit for measuring spectral flux density and one
Jansky is equivalent to @mymath{10^{-26} W/m^2/Hz} (watts per square meter per
hertz).
+This conversion can be done with the fact that in the AB magnitude
standard@footnote{@url{https://en.wikipedia.org/wiki/AB_magnitude}},
@mymath{3631Jy} corresponds to the zero-th magnitude, therefore
@mymath{B\equiv3631\times10^{6}\mu{Jy}} and @mymath{m\equiv0}.
+We can therefore estimate the brightness (@mymath{B_z}, in @mymath{\mu{Jy}})
corresponding to the image zero point (@mymath{Z}) using this equation:
+@dispmath{m - Z = -2.5\log_{10}(B/B_z)}
+@dispmath{0 - Z = -2.5\log_{10}({3631\times10^{6}\over B_z})}
+@dispmath{B_z = 3631\times10^{\left(6 - {Z \over 2.5} \right)} \mu{Jy}}
+@cindex SDSS
+Because the image zero point corresponds to a pixel value of @mymath{1}, the
@mymath{B_z} value calculated above also corresponds to a pixel value of
@mymath{1}.
+Therefore you simply have to multiply your image by @mymath{B_z} to convert it
to @mymath{\mu{Jy}}.
+Do not forget that this only applies when your zero point was also estimated
in the AB magnitude system.
+On the command-line, you can estimate this value for a certain zero point with
AWK, then multiply all the pixels in the image by it with @ref{Arithmetic}.
+For example, let's assume you are using an SDSS image with a zero point of
22.5:
+@example
+$ bz=$(echo 22.5 | awk '@{print 3631 * 10^(6-$1/2.5)@}')
+$ astarithmetic sdss.fits $bz x --output=sdss-in-muJy.fits
+@end example
+@noindent
+But in Gnuastro, it gets even easier: Arithmetic has an operator called
@code{counts-to-jy}.
+This will directly convert your image pixels (in units of counts) to Janskys
through a provided AB magnitude-based zero point, as shown below.
+See @ref{Arithmetic operators} for more.
+@example
+$ astarithmetic sdss.fits 22.5 counts-to-jy
+@end example
+@cartouche
+@noindent
+@strong{Be careful with the exposure time:} as described at the start of this
section, we are assuming your data are in units of counts/sec.
+As a result, the counts you get from the command above are only for one
second of exposure!
+Please see the discussion below in ``Magnitude to counts'' for more.
+@end cartouche
+@item Magnitude to counts (accounting for exposure time)
+@cindex Exposure time
+Until now, we had assumed that the data are in units of counts/sec.
+As a result, the equations given above (in the ``Zero point'' item) to convert
magnitudes to pixel counts give the count level for the reference (1 second)
exposure.
+But we rarely take 1 second exposures!
+It is therefore very important to take the exposure time into account in
scenarios like simulating observations with varying exposure times (where you
need to know how many counts the object of a certain magnitude will add to a
certain image with a certain exposure time).
+To clarify the concept, let's define @mymath{C} as the @emph{counted}
electrons (which has a linear relation with the photon energy entering the CCD
pixel).
+In this case, if an object of brightness @mymath{B} is observed for @mymath{t}
seconds, it will accumulate @mymath{C=B\times t} counts@footnote{Recall that
counts are another name for ADUs, which already include the CCD gain.}.
+Therefore, the generic magnitude equation above can be written as:
+@dispmath{m = -2.5\log_{10}(B) + Z = -2.5\log_{10}(C/t) + Z}
+@noindent
+From this, we can derive @mymath{C(t)} in relation to @mymath{C(1)}, or counts
from a 1 second exposure, using this relation:
+@dispmath{C(t) = t\times10^{(Z-m)/2.5} = t\times C(1)}
+In other words, you should simply multiply the counts for one second by the
number of observed seconds.
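+
+As a small command-line sketch of this relation (the magnitude of 20, zero
+point of 22.5 and exposure time of 120 seconds below are hypothetical values),
+the expected counts are @mymath{120\times10^{(22.5-20)/2.5}=1200}:
+
+@example
+$ echo "20 22.5 120" | awk '@{print $3 * 10^(($2-$1)/2.5)@}'
+@end example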
-@node Measuring elliptical parameters, Adding new columns to MakeCatalog,
Quantifying measurement limits, MakeCatalog
-@subsection Measuring elliptical parameters
+Another approach is to shift the time-dependence of the counts into the zero
point (after all, the exposure time is also a hardware-related parameter).
+Let's derive the equation below:
+@dispmath{m = -2.5\log_{10}(C/t) + Z = -2.5\log_{10}(C) + 2.5\log_{10}(t) + Z}
+Therefore, defining an exposure-time-dependent zero point as @mymath{Z(t)}, we
can directly correlate a certain object's magnitude with counts after an
exposure of @mymath{t} seconds:
+@dispmath{m = -2.5\log_{10}(C) + Z(t) \quad\rm{where}\quad Z(t)=Z +
2.5\log_{10}(t)}
+This solution is useful in programs like @ref{MakeCatalog} or
@ref{MakeProfiles}, when you cannot (or do not want to, because of the extra
storage/speed costs) manipulate the image's pixel values (for example, dividing
them by the exposure time to use a counts/sec zero point).
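+
+For example, with a hypothetical zero point of 22.5 (for counts/sec) and an
+exposure time of 120 seconds, the time-dependent zero point
+@mymath{Z(120)=22.5+2.5\log_{10}(120)\approx27.7} can be computed like this:
+
+@example
+$ echo "22.5 120" | awk '@{print $1 + 2.5*log($2)/log(10)@}'
+@end example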
-The shape or morphology of a target is one of the most commonly desired
parameters of a target.
-Here, we will review the derivation of the most basic/simple morphological
parameters: the elliptical parameters for a set of labeled pixels.
-The elliptical parameters are: the (semi-)major axis, the (semi-)minor axis
and the position angle along with the central position of the profile.
-The derivations below follow the SExtractor manual derivations with some added
explanations for easier reading.
+@item Surface brightness
+@cindex Steradian
+@cindex Angular coverage
+@cindex Celestial sphere
+@cindex Surface brightness
+@cindex SI (International System of Units)
+Another important concept is the distribution of an object's brightness over
its area.
+For this, we define the @emph{surface brightness} to be the magnitude of an
object's brightness divided by its solid angle over the celestial sphere (or
coverage in the sky, commonly in units of arcsec@mymath{^2}).
+The solid angle is expressed in units of arcsec@mymath{^2} because
astronomical targets are usually much smaller than one steradian.
+Recall that the steradian is the dimension-less SI unit of a solid angle and 1
steradian covers @mymath{1/4\pi} (almost @mymath{8\%}) of the full celestial
sphere.
-@cindex Moments
-Let's begin with one dimension for simplicity: Assume we have a set of
@mymath{N} values @mymath{B_i} (for example, showing the spatial distribution
of a target's brightness), each at position @mymath{x_i}.
-The simplest parameter we can define is the geometric center of the object
(@mymath{x_g}) (ignoring the brightness values): @mymath{x_g=(\sum_ix_i)/N}.
-@emph{Moments} are defined to incorporate both the value (brightness) and
position of the data.
-The first moment can be written as:
+Surface brightness is therefore most commonly expressed in units of
mag/arcsec@mymath{^2}.
+For example, when the brightness is measured over an area of A
arcsec@mymath{^2}, then the surface brightness becomes:
-@dispmath{\overline{x}={\sum_iB_ix_i \over \sum_iB_i}}
+@dispmath{S = -2.5\log_{10}(B/A) + Z = -2.5\log_{10}(B) + 2.5\log_{10}(A) + Z}
-@cindex Variance
-@cindex Second moment
@noindent
-This is essentially the weighted (by @mymath{B_i}) mean position.
-The geometric center (@mymath{x_g}, defined above) is a special case of this
with all @mymath{B_i=1}.
-The second moment is essentially the variance of the distribution:
+In other words, the surface brightness (in units of mag/arcsec@mymath{^2}) is
related to the object's magnitude (@mymath{m}) and area (@mymath{A}, in units
of arcsec@mymath{^2}) through this equation:
-@dispmath{\overline{x^2}\equiv{\sum_iB_i(x_i-\overline{x})^2 \over
- \sum_iB_i} = {\sum_iB_ix_i^2 \over \sum_iB_i} -
- 2\overline{x}{\sum_iB_ix_i\over\sum_iB_i} + \overline{x}^2
- ={\sum_iB_ix_i^2 \over \sum_iB_i} - \overline{x}^2}
+@dispmath{S = m + 2.5\log_{10}(A)}
-@cindex Standard deviation
-@noindent
-The last step was done from the definition of @mymath{\overline{x}}.
-Hence, the square root of @mymath{\overline{x^2}} is the spatial standard
deviation (along the one-dimension) of this particular brightness distribution
(@mymath{B_i}).
-Crudely (or qualitatively), you can think of its square root as the distance
(from @mymath{\overline{x}}) which contains a specific amount of the flux
(depending on the @mymath{B_i} distribution).
-Similar to the first moment, the geometric second moment can be found by
setting all @mymath{B_i=1}.
-So while the first moment quantified the position of the brightness
distribution, the second moment quantifies how that brightness is dispersed
about the first moment.
-In other words, it quantifies how ``sharp'' the object's image is.
+A common mistake is to follow the mag/arcsec@mymath{^2} unit literally, and
divide the object's magnitude by its area.
+But this is wrong because magnitude is a logarithmic scale while area is
linear.
+It is the brightness that should be divided by the solid angle because both
have linear scales.
+The magnitude of that ratio is then defined to be the surface brightness.
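+
+As a quick numerical sketch of the correct relation (with a hypothetical
+magnitude of 20 spread over a hypothetical area of 25 arcsec@mymath{^2}), the
+surface brightness is @mymath{20+2.5\log_{10}(25)\approx23.5}
+mag/arcsec@mymath{^2}:
+
+@example
+$ echo "20 25" | awk '@{print $1 + 2.5*log($2)/log(10)@}'
+@end example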
-@cindex Floating point error
-Before continuing to two dimensions and the derivation of the elliptical
parameters, let's pause for an important implementation technicality.
-You can ignore this paragraph and the next two if you do not want to implement
these concepts.
-The basic definition (first definition of @mymath{\overline{x^2}} above) can
be used without any major problem.
-However, using this fraction requires two runs over the data: one run to find
@mymath{\overline{x}} and another run to find @mymath{\overline{x^2}} from
@mymath{\overline{x}}, this can be slow.
-The advantage of the last fraction above, is that we can estimate both the
first and second moments in one run (since the @mymath{-\overline{x}^2} term
can easily be added later).
+One usual application of this is to convert an image's pixel values to surface
brightness, when you know its zero point.
+This can be done with the simple commands below.
+First, we derive the pixel area (in arcsec@mymath{^2}), then we use Arithmetic
to convert the pixels into surface brightness; see below for the details.
-The logarithmic nature of floating point number digitization creates a
complication however: suppose the object is located between pixels 10000 and
10020.
-Hence the target's pixels are only distributed over 20 pixels (with a standard
deviation @mymath{<20}), while the mean has a value of @mymath{\sim10000}.
-The @mymath{\sum_iB_i^2x_i^2} will go to very very large values while the
individual pixel differences will be orders of magnitude smaller.
-This will lower the accuracy of our calculation due to the limited accuracy of
floating point operations.
-The variance only depends on the distance of each point from the mean, so we
can shift all position by a constant/arbitrary @mymath{K} which is much closer
to the mean: @mymath{\overline{x-K}=\overline{x}-K}.
-Hence we can calculate the second order moment using:
+@example
+$ zeropoint=22.5
+$ pixarea=$(astfits image.fits --pixelareaarcsec2)
+$ astarithmetic image.fits $zeropoint $pixarea counts-to-sb \
+ --output=image-sb.fits
+@end example
-@dispmath{ \overline{x^2}={\sum_iB_i(x_i-K)^2 \over \sum_iB_i} -
- (\overline{x}-K)^2 }
+See @ref{Reverse polish notation} for more on Arithmetic's notation and
@ref{Arithmetic operators} for a description of each operator.
+And see @ref{FITS images in a publication} for a fully working tutorial on how
to optimally convert a FITS image to a PDF image for usage in a publication
using the surface brightness conversion shown above.
+@cartouche
@noindent
-The closer @mymath{K} is to @mymath{\overline{x}}, the better (the sums of
squares will involve smaller numbers), as long as @mymath{K} is within the
object limits (in the example above: @mymath{10000\leq{K}\leq10020}), the
floating point error induced in our calculation will be negligible.
-For the most simplest implementation, MakeCatalog takes @mymath{K} to be the
smallest position of the object in each dimension.
-Since @mymath{K} is arbitrary and an implementation/technical detail, we will
ignore it for the remainder of this discussion.
+@strong{Do not warp or convolve magnitude or surface brightness images:}
Warping an image involves calculating new pixel values (of the new pixel grid)
from the old pixel values.
+Convolution is also a process of finding the weighted mean of pixel values.
+During these processes, many arithmetic operations are done on the original
pixel values, for example, addition or multiplication.
+However, @mymath{\log_{10}(a+b)\ne \log_{10}(a)+\log_{10}(b)}.
+Therefore after calculating a magnitude or surface brightness image, do not
apply any such operations on it!
+If you need to warp or convolve the image, do it @emph{before} the conversion.
+@end cartouche
+@end table
-In two dimensions, the mean and variances can be written as:
-@dispmath{\overline{x}={\sum_iB_ix_i\over B_i}, \quad
- \overline{x^2}={\sum_iB_ix_i^2 \over \sum_iB_i} -
- \overline{x}^2}
-@dispmath{\overline{y}={\sum_iB_iy_i\over B_i}, \quad
- \overline{y^2}={\sum_iB_iy_i^2 \over \sum_iB_i} -
- \overline{y}^2}
-@dispmath{\quad\quad\quad\quad\quad\quad\quad\quad\quad
- \overline{xy}={\sum_iB_ix_iy_i \over \sum_iB_i} -
- \overline{x}\times\overline{y}}
-If an elliptical profile's major axis exactly lies along the @mymath{x} axis,
then @mymath{\overline{x^2}} will be directly proportional with the profile's
major axis, @mymath{\overline{y^2}} with its minor axis and
@mymath{\overline{xy}=0}.
-However, in reality we are not that lucky and (assuming galaxies can be
parameterized as an ellipse) the major axis of galaxies can be in any direction
on the image (in fact this is one of the core principles behind weak-lensing by
shear estimation).
-So the purpose of the remainder of this section is to define a strategy to
measure the position angle and axis ratio of some randomly positioned ellipses
in an image, using the raw second moments that we have calculated above in our
image coordinates.
-Let's assume we have rotated the galaxy by @mymath{\theta}, the new second
order moments are:
-@dispmath{\overline{x_\theta^2} = \overline{x^2}\cos^2\theta +
- \overline{y^2}\sin^2\theta -
- 2\overline{xy}\cos\theta\sin\theta }
-@dispmath{\overline{y_\theta^2} = \overline{x^2}\sin^2\theta +
- \overline{y^2}\cos^2\theta +
- 2\overline{xy}\cos\theta\sin\theta}
-@dispmath{\overline{xy_\theta} = \overline{x^2}\cos\theta\sin\theta -
- \overline{y^2}\cos\theta\sin\theta +
- \overline{xy}(\cos^2\theta-\sin^2\theta)}
+@node Quantifying measurement limits, Measuring elliptical parameters,
Brightness flux magnitude, MakeCatalog
+@subsection Quantifying measurement limits
-@noindent
-The best @mymath{\theta} (@mymath{\theta_0}, where major axis lies along the
@mymath{x_\theta} axis) can be found by:
+@cindex Depth of data
+@cindex Clump magnitude limit
+@cindex Object magnitude limit
+@cindex Limit, object/clump magnitude
+@cindex Magnitude, object/clump detection limit
+No measurement on a real dataset can be perfect: you can only reach a certain
level/limit of accuracy and a meaningful (scientific) analysis requires an
understanding of these limits.
+Different datasets have different noise properties, and different detection
methods (one method/algorithm/software that is run with a different set of
parameters is considered a different detection method) will have different
abilities to detect or measure certain kinds of signal (astronomical objects)
and their properties in the dataset.
+Hence, quantifying the detection and measurement limitations with a particular
dataset and analysis tool is the most crucial/critical aspect of any high-level
analysis.
+In two separate tutorials, we have touched upon some of these points.
+So to see the discussions below in action (on real data), see @ref{Measuring
the dataset limits} and @ref{Image surface brightness limit}.
-@dispmath{\left.{\partial \overline{x_\theta^2} \over \partial
\theta}\right|_{\theta_0}=0}
-Taking the derivative, we get:
-@dispmath{2\cos\theta_0\sin\theta_0(\overline{y^2}-\overline{x^2}) +
-2(\cos^2\theta_0-\sin^2\theta_0)\overline{xy}=0} When
-@mymath{\overline{x^2}\neq\overline{y^2}}, we can write:
-@dispmath{\tan2\theta_0 =
-2{\overline{xy} \over \overline{x^2}-\overline{y^2}}.}
+Here, we will review some of the most commonly used methods to quantify the
limits in astronomical data analysis and how MakeCatalog makes it easy to
measure them.
+Depending on the higher-level analysis, there are more tests that must be
done, but these are relatively low-level and usually necessary in most cases.
+In astronomy, it is common to use the magnitude (a unit-less, logarithmic
scale) rather than raw physical units, see @ref{Brightness flux magnitude}.
+Therefore the measurements discussed here are commonly expressed in units of
magnitudes.
-@cindex Position angle
-@noindent
-MakeCatalog uses the standard C math library's @code{atan2} function to
estimate @mymath{\theta_0}, which we define as the position angle of the
ellipse.
-To recall, this is the angle of the major axis of the ellipse with the
@mymath{x} axis.
-By definition, when the elliptical profile is rotated by @mymath{\theta_0},
then @mymath{\overline{xy_{\theta_0}}=0}, @mymath{\overline{x_{\theta_0}^2}}
will be the extent of the maximum variance and
@mymath{\overline{y_{\theta_0}^2}} the extent of the minimum variance (which
are perpendicular for an ellipse).
-Replacing @mymath{\theta_0} in the equations above for
@mymath{\overline{x_\theta}} and @mymath{\overline{y_\theta}}, we can get the
semi-major (@mymath{A}) and semi-minor (@mymath{B}) lengths:
+@menu
+* Standard deviation vs error:: The std is not a measure of the error.
+* Magnitude measurement error of each detection:: Error in measuring
magnitude.
+* Surface brightness error of each detection:: Error in measuring the Surface
brightness.
+* Completeness limit of each detection:: Possibility of detecting similar
objects?
+* Upper limit magnitude of each detection:: How reliable is your magnitude?
+* Magnitude limit of image:: Measured magnitude of objects at certain S/N.
+* Surface brightness limit of image:: Extrapolate per-pixel noise-level to
standard units.
+* Upper limit magnitude of image:: Measure the noise-level for a certain
aperture.
+@end menu
-@dispmath{A^2\equiv\overline{x_{\theta_0}^2}= {\overline{x^2} +
-\overline{y^2} \over 2} + \sqrt{\left({\overline{x^2}-\overline{y^2} \over
2}\right)^2 + \overline{xy}^2}}
+@node Standard deviation vs error, Magnitude measurement error of each
detection, Quantifying measurement limits, Quantifying measurement limits
+@subsubsection Standard deviation vs error
+The error and the standard deviation are sometimes confused with each other.
+Therefore, before continuing with the various measurement limits below, let's
review these two fundamental concepts.
+Instead of going into the theoretical definitions of the two (which you can see
in their respective Wikipedia pages), we'll discuss the concepts in a hands-on
and practical way here.
-@dispmath{B^2\equiv\overline{y_{\theta_0}^2}= {\overline{x^2} +
-\overline{y^2} \over 2} - \sqrt{\left({\overline{x^2}-\overline{y^2} \over
2}\right)^2 + \overline{xy}^2}}
+Let's simulate an observation of the sky, but without any astronomical sources!
+In other words, one where we only have a background flux level (from the sky
emission).
+With the first command below, let's make an image called @file{1.fits} that
contains @mymath{200\times200} pixels that are filled with random noise from a
Poisson distribution with a mean of 100 counts (the flux from the background
sky).
+Recall that the Poisson distribution closely resembles a normal distribution
for large mean values (as in this case).
-As a summary, it is important to remember that the units of @mymath{A} and
@mymath{B} are in pixels (the standard deviation of a positional distribution)
and that they represent the spatial light distribution of the object in both
image dimensions (rotated by @mymath{\theta_0}).
-When the object cannot be represented as an ellipse, this interpretation
breaks down: @mymath{\overline{xy_{\theta_0}}\neq0} and
@mymath{\overline{y_{\theta_0}^2}} will not be the direction of minimum
variance.
+The standard deviation (@mymath{\sigma}) of the Poisson distribution is the
square root of the mean, see @ref{Photon counting noise}.
+With the second command, we'll have a look at the image.
+Note that due to the random nature of the noise, the values reported in the
next steps on your computer will be very slightly different.
+To reproduce exactly the same values in different runs, see @ref{Generating
random numbers}, and for more on the first command, see @ref{Arithmetic}.
+
+@example
+$ astarithmetic 200 200 2 makenew 100 mknoise-poisson \
+ --output=1.fits
+
+$ astscript-fits-view 1.fits
+@end example
+Each pixel shows the result of one sampling from the Poisson distribution.
+In other words, assuming the sky emission in our simulation is constant over
our field of view, each pixel's value shows one measurement of the sky emission.
+Statistically speaking, a ``measurement'' is a sampling from an underlying
distribution of values.
+Through our measurements, we aim to identify that underlying distribution (the
``truth'')!
+With the command below, let's look at the pixel statistics of @file{1.fits}
(output is shown immediately under it).
+@c If you change this output, replace the standard deviation (10.09) below
+@c in the text.
+@example
+$ aststatistics 1.fits
+Statistics (GNU Astronomy Utilities) @value{VERSION}
+-------
+Input: 1.fits (hdu: 1)
+-------
+ Number of elements: 40000
+ Minimum: -4.72824245470431e+01
+ Maximum: 4.24861780263050e+01
+ Mode: 0.09274776246
+ Mode quantile: 0.5004125103
+ Median: 8.36190404450713e-02
+ Mean: 0.098637593
+ Standard deviation: 10.09065298
+-------
+Histogram:
+ | * ****
+ | *********
+ | ************
+ | **************
+ | *****************
+ | ********************
+ | ***********************
+ | **************************
+ | ******************************
+ | **************************************
+ |* * *********************************************************** * *
+ |----------------------------------------------------------------------
+@end example
+As expected, you see that the ASCII histogram nicely resembles a normal
distribution.
+The measured mean and standard deviation (@mymath{\sigma_x}) are also very
similar to the input (mean of 100, standard deviation of @mymath{\sigma=10}).
+But the measured mean (and standard deviation) aren't exactly equal to the
input!
+Every time we make a different simulated image from the same distribution, the
measured mean and standard deviation will slightly differ.
+With the second command below, let's build 500 images like above and measure
their mean and standard deviation.
+The outputs will be written into a file (@file{mean-stds.txt}; in the first
command we are deleting it to make sure we write into an empty file within the
loop).
+With the third command, let's view the top 10 rows:
+@example
+$ rm -f mean-stds.txt
+$ for i in $(seq 500); do \
+ astarithmetic 200 200 2 makenew 100 mknoise-poisson \
+ --output=$i.fits --quiet; \
+ aststatistics $i.fits --mean --std >> mean-stds.txt; \
+ echo "$i: complete"; \
+ done
-@node Adding new columns to MakeCatalog, MakeCatalog measurements, Measuring
elliptical parameters, MakeCatalog
-@subsection Adding new columns to MakeCatalog
+$ asttable mean-stds.txt -Y --head=10
+99.989381 9.936407
+100.036622 10.059997
+100.006054 9.985470
+99.944535 9.960069
+100.050318 9.970116
+100.002718 9.905395
+100.067555 9.964038
+100.027167 10.018562
+100.051951 9.995859
+100.000212 9.970293
+@end example
-MakeCatalog is designed to allow easy addition of different measurements over
a labeled image (see @url{https://arxiv.org/abs/1611.06387v1, Akhlaghi [2016]}).
-A check-list style description of necessary steps to do that is described in
this section.
-The common development characteristics of MakeCatalog and other Gnuastro
programs is explained in @ref{Developing}.
-We strongly encourage you to have a look at that chapter to greatly simplify
your navigation in the code.
-After adding and testing your column, you are most welcome (and encouraged) to
share it with us so we can add to the next release of Gnuastro for everyone
else to also benefit from your efforts.
+From this table, you see that each simulation has produced a slightly
different measured mean and measured standard deviation (@mymath{\sigma_x})
that are just fluctuating around the input mean (which was 100) and input
standard deviation (@mymath{\sigma=10}).
+Let's have a look at the distribution of mean measurements:
-MakeCatalog will first pass over each label's pixels two times and do
necessary raw/internal calculations.
-Once the passes are done, it will use the raw information for filling the
final catalog's columns.
-In the first pass it will gather mainly object information and in the second
run, it will mainly focus on the clumps, or any other measurement that needs an
output from the first pass.
-These two passes are designed to be raw summations: no extra processing.
-This will allow parallel processing and simplicity/clarity.
-So if your new calculation, needs new raw information from the pixels, then
you will need to also modify the respective @code{mkcatalog_first_pass} and
@code{mkcatalog_second_pass} functions (both in
@file{bin/mkcatalog/mkcatalog.c}) and define new raw table columns in
@file{main.h} (hopefully the comments in the code are clear enough).
+@example
+$ aststatistics mean-stds.txt -c1
+Statistics (GNU Astronomy Utilities) @value{VERSION}
+-------
+Input: mean-stds.txt
+Column: 1
+-------
+ Number of elements: 500
+ Minimum: 9.98183528700191e+01
+ Maximum: 1.00146490891332e+02
+ Mode: 99.99709739
+ Mode quantile: 0.49498998
+ Median: 9.99977393190436e+01
+ Mean: 99.99891826
+ Standard deviation: 0.04901635275
+-------
+Histogram:
+ | *
+ | * **
+ | ****** **** * *
+ | ****** **** * * *
+ | * * ************* * *
+ | * ****************** **
+ | * ********************* *** *
+ | * ***************************** ***
+ | *** ********************************** *
+ | *** ******************************************* **
+ | * ************************************************* ** *
+ |----------------------------------------------------------------------
+@end example
-In all these different places, the final columns are sorted in the same order
(same order as @ref{Invoking astmkcatalog}).
-This allows a particular column/option to be easily found in all steps.
-Therefore in adding your new option, be sure to keep it in the same relative
place in the list in all the separate places (it does not necessarily have to
be in the end), and near conceptually similar options.
+@cindex Standard error of mean
+The standard deviation of the various mean measurements above shows the
scatter in measuring the mean with an image of this size from this underlying
distribution.
+This is therefore defined as the @emph{standard error of the mean}, or
``error'' for short (since most measurements are actually the mean of a
population) and shown with @mymath{\widehat\sigma_{\bar{x}}}.
-@table @file
+From the example above, you see that the error is smaller than the standard
deviation (smaller when you have a larger sample).
+In fact, @url{https://en.wikipedia.org/wiki/Standard_error#Derivation, it can
be shown} that this ``error of the mean'' (@mymath{\sigma_{\bar{x}}}) is
related to the distribution standard deviation (@mymath{\sigma}) through the
equation below, where @mymath{N} is the number of points used to measure the
mean in one sample (@mymath{200\times200=40000} in this case).
+Note that the @mymath{10.09} below was reported as ``standard deviation'' in
the first run of @code{aststatistics} on @file{1.fits} above:
-@item main.h
-The @code{objectcols} and @code{clumpcols} enumerated variables (@code{enum})
define the raw/internal calculation columns.
-If your new column requires new raw calculations, add a row to the respective
list.
-If your calculation requires any other settings parameters, you should add a
variable to the @code{mkcatalogparams} structure.
+@c The 10.09 depends on the 'aststatistics 1.fits' command above.
+@dispmath{\sigma_{\bar{x}}=\frac{\sigma}{\sqrt{N}} \quad\quad {\rm or}
\quad\quad \widehat\sigma_{\bar{x}}\approx\frac{\sigma_x}{\sqrt{N}} =
\frac{10.09}{200} = 0.05}
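+
+You can compare this analytic estimate with the direct scatter of the 500
+measured means from the simulations above; the standard deviation printed by
+the command below should be close to the 0.05 derived here:
+
+@example
+$ aststatistics mean-stds.txt -c1 --std
+@end example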
-@item ui.c
-If the new column needs raw calculations (an entry was added in
@code{objectcols} and @code{clumpcols}), specify which inputs it needs in
@code{ui_necessary_inputs}, similar to the other options.
-Afterwards, if your column includes any particular settings (you needed to add
a variable to the @code{mkcatalogparams} structure in @file{main.h}), you
should do the sanity checks and preparations for it here.
+@noindent
+Taking the considerations above into account, we should clearly distinguish
the following concepts when talking about the standard deviation or error:
-@item ui.h
-The @code{option_keys_enum} associates a unique value for each option to
MakeCatalog.
-The options that have a short option version, the single character short
comment is used for the value.
-Those that do not have a short option version, get a large integer
automatically.
-You should add a variable here to identify your desired column.
+@table @asis
+@item Standard deviation of population
+This is the standard deviation of the underlying distribution (10 in the
example above), and shown by @mymath{\sigma}.
+This is something you can never measure, and is just the ideal value.
+@item Standard deviation of mean
+Ideal error of measuring the mean (assuming we know @mymath{\sigma}).
-@cindex GNU C library
-@item args.h
-This file specifies all the parameters for the GNU C library, Argp structure
that is in charge of reading the user's options.
-To define your new column, just copy an existing set of parameters and change
the first, second and 5th values (the only ones that differ between all the
columns), you should use the macro you defined in @file{ui.h} here.
+@item Standard deviation of sample (i.e., @emph{Standard deviation})
+The measured standard deviation from one sampling of the ideal distribution.
+This is the second column of @file{mean-stds.txt} and is shown with
@mymath{\sigma_x} above.
+In astronomical literature, this is simply referred to as the ``standard
deviation''.
+In other words, the standard deviation is computed on the input itself, so
MakeCatalog just needs a ``values'' file.
+For example, when measuring the standard deviation of an astronomical object
with MakeCatalog, it is computed directly from the input values.
-@item columns.c
-This file contains the main definition and high-level calculation of your new
column through the @code{columns_define_alloc} and @code{columns_fill}
functions.
-In the first, you specify the basic information about the column: its name,
units, comments, type (see @ref{Numeric data types}) and how it should be
printed if the output is a text file.
-You should also specify the raw/internal columns that are necessary for this
column here as the many existing examples show.
-Through the types for objects and rows, you can specify if this column is only
for clumps, objects or both.
+@item Standard error (i.e., @emph{error})
+Measurable scatter of measuring the mean (@mymath{\widehat\sigma_{\bar{x}}})
that can be estimated from the size of the sample and the measured standard
deviation (@mymath{\sigma_x}).
+In astronomical literature, this is simply referred to as the ``error''.
-The second main function (@code{columns_fill}) writes the final value into the
appropriate column for each object and clump.
-As you can see in the many existing examples, you can define your processing
on the raw/internal calculations here and save them in the output.
+In other words, when asking for an ``error'' measurement with MakeCatalog, a
separate standard deviation dataset should always be provided.
+This dataset should take into account all sources of scatter.
+For example, during the reduction of an image, the standard deviation dataset
should take into account the dispersion of each pixel that comes from the bias,
dark, flat fielding, etc.
+If this image is not available, it is possible to use the @code{SKY_STD}
extension from NoiseChisel as an estimation.
+For more, see @ref{NoiseChisel output}.
+@end table
-@item mkcatalog.c
-This file contains the low-level parsing functions.
-To be optimized, the parsing is done in parallel through the
@code{mkcatalog_single_object} function.
-This function initializes the necessary arrays and calls the lower-level
@code{parse_objects} and @code{parse_clumps} for actually going over the pixels.
-They are all heavily commented, so you should be able to follow where to add
your necessary low-level calculations.
+@node Magnitude measurement error of each detection, Surface brightness error
of each detection, Standard deviation vs error, Quantifying measurement limits
+@subsubsection Magnitude measurement error of each detection
+The raw error in measuring the magnitude is only meaningful when the object's
magnitude is brighter than the upper-limit magnitude (see below).
+As discussed in @ref{Brightness flux magnitude}, the magnitude (@mymath{M}) of
an object with brightness @mymath{B} and zero point magnitude @mymath{Z} can be
written as:
-@item doc/gnuastro.texi
-Update this manual and add a description for the new column.
+@dispmath{M=-2.5\log_{10}(B)+Z}
-@end table
+@noindent
+Calculating the derivative with respect to @mymath{B}, we get:
+@dispmath{{dM\over dB} = {-2.5\over {B\times ln(10)}}}
+@noindent
+From the Taylor series (@mymath{\Delta{M}=dM/dB\times\Delta{B}}), we can write:
+@dispmath{\Delta{M} = \left|{-2.5\over ln(10)}\right|\times{\Delta{B}\over{B}}}
-@node MakeCatalog measurements, Invoking astmkcatalog, Adding new columns to
MakeCatalog, MakeCatalog
-@subsection MakeCatalog measurements
+@noindent
+But @mymath{\Delta{B}/B} is just the inverse of the signal-to-noise ratio
(@mymath{S/N}), so we can write the error in magnitude in terms of the
signal-to-noise ratio:
-MakeCatalog's output measurements/columns can be specified using command-line
options (@ref{Options}).
-The current measurements in MakeCatalog are those which only produce one final
value for each label (for example, its magnitude: a single number).
-All the different label's measurements can be written as one column in a final
table/catalog that contains other columns for other similar single-number
measurements.
+@dispmath{ \Delta{M} = {2.5\over{S/N\times ln(10)}} }
-In this case, all the different label's measurements can be written as one
column in a final table/catalog that contains other columns for other similar
single-number measurements.
-The majority of this section is devoted to MakeCatalog's single-valued
measurements.
-However, MakeCatalog can also do measurements that produce more than one value
for each label.
-Currently the only such measurement is generation of spectra from 3D cubes
with the @option{--spectrum} option and it is discussed in the end of this
section.
+MakeCatalog uses this relation to estimate the magnitude errors.
+The signal-to-noise ratio is calculated in different ways for clumps and
objects (see @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa
[2015]}), but this single equation can be used to estimate the measured
magnitude error afterwards for any type of target.
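+
+As a quick sketch of the scale of this error, for a hypothetical
+signal-to-noise ratio of 5, the relation above gives roughly 0.2 magnitudes:
+
+@example
+$ echo 5 | awk '@{print 2.5/($1*log(10))@}'
+@end example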
-Command-line options are used to identify which measurements you want in the
final catalog(s) and in what order.
-If any of the options below is called on the command-line or in any of the
configuration files, it will be included as a column in the output catalog.
-The order of the columns is in the same order as the options were seen by
MakeCatalog (see @ref{Configuration file precedence}).
-Some of the columns apply to both ``objects'' and ``clumps'' and some are
particular to only one of them (for the definition of ``objects'' and
``clumps'', see @ref{Segment}).
-Columns/options that are unique to one catalog (only objects, or only clumps),
are explicitly marked with [Objects] or [Clumps] to specify the catalog they
will be placed in.
+@node Surface brightness error of each detection, Completeness limit of each
detection, Magnitude measurement error of each detection, Quantifying
measurement limits
+@subsubsection Surface brightness error of each detection
-@menu
-* Identifier columns:: Identifying labels of each row (object/clumps).
-* Position measurements in pixels:: Containing image/pixel (X/Y) measurements.
-* Position measurements in WCS:: Containing WCS (for example RA/Dec)
measurements.
-* Brightness measurements:: Using pixel values of each label.
-* Surface brightness measurements:: Various ways to measure surface
brightness.
-* Morphology measurements nonparametric:: Non-parametric morphology.
-* Morphology measurements elliptical:: Elliptical morphology measurements.
-* Measurements per slice spectra:: Measurements on each slice (like spectra).
-@end menu
+@cindex Surface brightness error
+@cindex Error in surface brightness
+We can derive the error in measuring the surface brightness based on the
surface brightness (SB) equation of @ref{Brightness flux magnitude} and the
generic magnitude error (@mymath{\Delta{M}}) of @ref{Magnitude measurement
error of each detection}.
+Let's set @mymath{A} to represent the area and @mymath{\Delta{A}} to represent
the error in measuring the area.
+For more on @mymath{\Delta{A}}, see the description of
@option{--spatialresolution} in @ref{MakeCatalog inputs and basic settings}.
-@node Identifier columns, Position measurements in pixels, MakeCatalog
measurements, MakeCatalog measurements
-@subsubsection Identifier columns
+@dispmath{\Delta{(SB)} = \Delta{M} + \left|{-2.5\over
ln(10)}\right|\times{\Delta{A}\over{A}}}
-The identifier of each row (group of measurements) is usually the first thing
you will be requesting from MakeCatalog.
-Without the identifier, it is not clear which measurement corresponds to which
label for the input.
+In the surface brightness equation mentioned above, @mymath{A} is in units of
arcsecond squared, and the conversion between arcseconds and pixels is a simple
multiplicative factor.
+Therefore as long as @mymath{A} and @mymath{\Delta{A}} have the same units, it
does not matter if they are in arcseconds or pixels.
+Since the measure of spatial resolution (or area error) is the FWHM of the PSF,
which is usually defined in terms of pixels, it is more intuitive to use pixels
for @mymath{A} and @mymath{\Delta{A}}.
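+
+As a small numerical sketch (with a hypothetical magnitude error of 0.2, an
+area of 100 pixels and an area error of 4 pixels), the surface brightness
+error becomes @mymath{0.2+1.086\times(4/100)\approx0.24}:
+
+@example
+$ echo "0.2 100 4" | awk '@{print $1 + 2.5/log(10) * $3/$2@}'
+@end example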
-Since MakeCatalog can also optionally take sub-structure label (clumps; see
@ref{Segment}), there are various identifiers in general that are listed below.
-The most generic (and shortest and easiest to type!) is the @option{--ids}
option which can be used in object-only or object-clump catalogs.
+@node Completeness limit of each detection, Upper limit magnitude of each
detection, Surface brightness error of each detection, Quantifying measurement
limits
+@subsubsection Completeness limit of each detection
+@cindex Completeness
+As the surface brightness of the objects decreases, the ability to detect them
will also decrease.
+An important statistic is thus the fraction of objects of similar morphology
and magnitude that will be detected with our detection algorithm/parameters in
a given image.
+This fraction is known as @emph{completeness}.
+For brighter objects, completeness is 1: all bright objects that might exist
over the image will be detected.
+However, as we go to objects of lower overall surface brightness, we will fail
to detect a fraction of them, and fainter than a certain surface brightness
level (for each morphology), nothing will be detectable in the image: you will
need more data to construct a ``deeper'' image.
+For a given profile and dataset, the magnitude where the completeness drops
below a certain level (usually above @mymath{90\%}) is known as the
completeness limit.
-@table @option
-@item --i
-@itemx --ids
-This is a unique option which can add multiple columns to the final catalog(s).
-Calling this option will put the object IDs (@option{--obj-id}) in the objects
catalog and host-object-ID (@option{--host-obj-id}) and ID-in-host-object
(@option{--id-in-host-obj}) into the clumps catalog.
-Hence if only object catalogs are required, it has the same effect as
@option{--obj-id}.
+@cindex Purity
+@cindex False detections
+@cindex Detections false
+Another important parameter in measuring completeness is purity: the fraction
of true detections to all detections.
+In effect, purity is the measure of contamination by false detections: the
higher the purity, the lower the contamination.
+Completeness and purity are anti-correlated: if we can allow a large number of
false detections (that we might be able to remove by other means), we can
significantly increase the completeness limit.
-@item --obj-id
-[Objects] ID of this object.
+One traditional way to measure the completeness and purity of a given sample
is by embedding mock profiles in regions of the image with no detection.
+However, in such a study we must be very careful to choose model profiles as
similar to the target of interest as possible.
-@item -j
-@itemx --host-obj-id
-[Clumps] The ID of the object which hosts this clump.
-@item --id-in-host-obj
-[Clumps] The ID of this clump in its host object.
-@end table
-@node Position measurements in pixels, Position measurements in WCS,
Identifier columns, MakeCatalog measurements
-@subsubsection Position measurements in pixels
+@node Upper limit magnitude of each detection, Magnitude limit of image,
Completeness limit of each detection, Quantifying measurement limits
+@subsubsection Upper limit magnitude of each detection
+Due to the noisy nature of data, it is possible to get arbitrarily faint
magnitudes, especially when you use labels from another image (for example see
@ref{Working with catalogs estimating colors}).
+Given the scatter caused by the dataset's noise, values fainter than a certain
level are meaningless: another similar depth observation will give a radically
different value.
+In such cases, measurements like the image magnitude limit are not useful
because it is estimated for a certain morphology and is given for the whole
image (it is a crude generalization; see @ref{Magnitude limit of image}).
+You want a quality measure that is specific to each object.
-The position of a labeled region within your input dataset (in its own units)
can be measured with the options in this section.
-By ``in its own units'' we mean pixels in a 2D image or voxels in a 3D cube.
-For example if the flux-weighted center of a label lies 123 pixels on the
horizontal and 456 pixels on the vertical, the @option{--x} and @option{--y}
options will put a value of 123 and 456 in their respective columns.
-As you see below, there are various ways to define the ``position'' of an
object, so read the differences carefully to choose the one that corresponds
best to your usage.
+For example, assume that you have done your detection and segmentation on one
filter and now you do measurements over the same labeled regions, but on other
filters to measure colors (as we did in the tutorial @ref{Segmentation and
making a catalog}).
+Some objects are not going to have any significant signal in the other
filters, but you may measure a magnitude of 36 for one of them, for example!
+This is clearly unreliable (no dataset in current astronomy is able to detect
such a faint signal).
+In another image with the same depth, using the same filter, you might measure
a magnitude of 30 for it, and yet another might give you 33.
+Furthermore, the total sum of pixel values might actually be negative in some
images of the same depth (due to noise).
+In these cases, no magnitude can be defined and MakeCatalog will place a NaN
there (recall that a magnitude is a base-10 logarithm).
-@table @option
-@item -x
-@itemx --x
-The flux weighted center of all objects and clumps along the first FITS axis
(horizontal when viewed in SAO DS9), see @mymath{\overline{x}} in
@ref{Measuring elliptical parameters}.
-The weight has to have a positive value (pixel value larger than the Sky
value) to be meaningful! Specially when doing matched photometry, this might
not happen: no pixel value might be above the Sky value.
-For such detections, the geometric center will be reported in this column (see
@option{--geo-x}).
-You can use @option{--weight-area} to see which was used.
+@cindex Upper limit magnitude
+@cindex Magnitude, upper limit
+Using such unreliable measurements will directly affect our analysis, so we
must not use the raw measurements.
+When approaching the limits of your detection method, it is therefore
important to be able to identify such cases.
+But how can we know how reliable a measurement of one object on a given
dataset is?
-@item -y
-@itemx --y
-The flux weighted center of all objects and clumps along the second FITS axis
(vertical when viewed in SAO DS9).
-See @option{--x}.
+When we confront such unreasonably faint magnitudes, there is one thing we can
deduce: that if something actually exists under our labeled pixels (possibly
buried deep under the noise), its inherent magnitude is fainter than an
@emph{upper limit magnitude}.
+To find this upper limit magnitude, we place the object's footprint
(segmentation map) over a random part of the image where there are no
detections, and measure the sum of pixel values within the footprint.
+Doing this a large number of times will give us a distribution of measurements
of the sum.
+The standard deviation (@mymath{\sigma}) of that distribution can be used to
quantify the upper limit magnitude for that particular object (given its
particular shape and area):
-@item -z
-@itemx --z
-The flux weighted center of all objects and clumps along the third FITS
-axis. See @option{--x}.
+@dispmath{M_{up,n\sigma}=-2.5\times\log_{10}{(n\sigma)}+Z \quad\quad
[mag/target]}
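+
+For example, with a hypothetical distribution standard deviation of 0.02 (in
+the image's pixel units), @mymath{n=3} and a zero point of 22.5, the upper
+limit magnitude is @mymath{-2.5\log_{10}(3\times0.02)+22.5\approx25.6}:
+
+@example
+$ echo "3 0.02 22.5" | awk '@{print -2.5*log($1*$2)/log(10) + $3@}'
+@end example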
-@item --geo-x
-The geometric center of all objects and clumps along the first FITS axis axis.
-The geometric center is the average pixel positions irrespective of their
pixel values.
+@cindex Correlated noise
+Traditionally, faint/small object photometry was done using fixed circular
apertures (for example, with a diameter of @mymath{N} arc-seconds) and there
was not much processing involved (to make a deep stack).
+Hence, the upper limit was synonymous with the surface brightness limit
discussed below (@ref{Surface brightness limit of image}): one value for the
whole image.
+The problem with this simplified approach is that the number of pixels in the
aperture directly affects the final distribution and thus magnitude.
+Also, the image's correlated noise might actually create certain patterns, so
the shape of the object can also affect the final result.
+Fortunately, with the much more advanced hardware and software of today, we
can make customized segmentation maps (footprint) for each object and have
enough computing power to actually place that footprint over many random places.
+As a result, the per-target upper-limit magnitude and general surface
brightness limit have diverged.
-@item --geo-y
-The geometric center of all objects and clumps along the second FITS axis
axis, see @option{--geo-x}.
+When any of the upper-limit-related columns are requested, MakeCatalog will
randomly place each target's footprint over the undetected parts of the dataset
as described above, and estimate the required properties.
+The procedure is fully configurable with the options in @ref{Upper-limit
settings}.
+You can get the full list of upper-limit related columns of MakeCatalog with
this command (the extra @code{--} before @code{--upperlimit} is
necessary@footnote{Without the extra @code{--}, grep will assume that
@option{--upperlimit} is one of its own options, and will thus abort,
complaining that it has no option with this name.}):
-@item --geo-z
-The geometric center of all objects and clumps along the third FITS axis axis,
see @option{--geo-x}.
+@example
+$ astmkcatalog --help | grep -- --upperlimit
+@end example
-@item --min-val-x
-Position of pixel with minimum value in objects and clumps, along the first
FITS axis.
+@node Magnitude limit of image, Surface brightness limit of image, Upper limit
magnitude of each detection, Quantifying measurement limits
+@subsubsection Magnitude limit of image
-@item --max-val-x
-Position of pixel with maximum value in objects and clumps, along the first
FITS axis.
+@cindex Magnitude limit
+Suppose we have taken two images of the same field of view with the same CCD,
once with a smaller telescope, and once with a larger one.
+Because we used the same CCD, the noise will be very similar.
+However, the larger telescope has gathered more light, therefore the same star
or galaxy will have a higher signal-to-noise ratio (S/N) in the image taken
with the larger one.
+The same applies for a stacked image of the field compared to a
single-exposure image of the same telescope.
-@item --min-val-y
-Position of pixel with minimum value in objects and clumps, along the first
FITS axis.
+This concept is used by some researchers to define the ``magnitude limit'' or
``detection limit'' at a certain S/N (sometimes 10, 5 or 3 for example, also
written as @mymath{10\sigma}, @mymath{5\sigma} or @mymath{3\sigma}).
+To do this, they measure the magnitude and signal-to-noise ratio of all the
objects within an image and measure the mean (or median) magnitude of objects
at the desired S/N.
+A fully working example of deriving the magnitude limit is available in the
tutorials section: @ref{Measuring the dataset limits}.
-@item --max-val-y
-Position of pixel with maximum value in objects and clumps, along the first
FITS axis.
+However, this method should be used with extreme care!
+This is because the shape of the object becomes important in this method: a
sharper object will have a higher @emph{measured} S/N compared to a more
diffuse object at the same original magnitude.
+Besides the inherent shape/sharpness of the object, issues like the PSF also
become important in this method (because the final observed shapes of the
objects matter here): two surveys with the same surface brightness limit (see
@ref{Surface brightness limit of image}) will have different magnitude limits
if one is taken from space and the other from the ground.
-@item --min-val-z
-Position of pixel with minimum value in objects and clumps, along the first
FITS axis.
+@node Surface brightness limit of image, Upper limit magnitude of image,
Magnitude limit of image, Quantifying measurement limits
+@subsubsection Surface brightness limit of image
+@cindex Surface brightness
+As we make more observations on one region of the sky and add/combine the
observations into one dataset, both the signal and the noise increase.
+However, the signal increases much faster than the noise:
+Assuming you add @mymath{N} datasets with equal exposure times, the signal
will increase as a multiple of @mymath{N}, while the noise increases as
@mymath{\sqrt{N}}.
+Therefore the signal-to-noise ratio increases by a factor of
@mymath{\sqrt{N}} (for example, stacking @mymath{N=4} exposures doubles the
S/N).
+Visually, fainter (per pixel) parts of the objects/signal in the image will
become more visible/detectable.
+This noise level is known as the dataset's surface brightness limit.
-@item --max-val-z
-Position of pixel with maximum value in objects and clumps, along the first
FITS axis.
+You can think of the noise as muddy water that is completely covering a flat
ground@footnote{The ground is the sky value in this analogy, see @ref{Sky
value}.
+Note that this analogy only holds for a flat sky value across the surface of
the image or ground.}.
+The signal (coming from astronomical objects in real data) will be
summits/hills that start from the flat sky level (under the muddy water) and
their summits can sometimes reach above the muddy water.
+Let's assume that in your first observation the muddy water has just been
stirred and, except for a few small peaks, you cannot see anything through
the mud.
+As you wait and make more observations/exposures, the mud settles down and the
@emph{depth} of the transparent water increases.
+As a result, more and more summits become visible and the lower parts of the
hills (parts with lower surface brightness) can be seen more clearly.
+In this analogy@footnote{Note that this muddy water analogy is not perfect,
because while the water-level remains the same all over a peak, in data
analysis, the Poisson noise increases with the level of data.}, height (from
the ground) is the @emph{surface brightness}, and the height of the muddy
water at the moment you combine your data is your @emph{surface brightness
limit} for that moment.
-@item --min-x
-The minimum position of all objects and clumps along the first FITS axis.
+@cindex Data's depth
+The outputs of NoiseChisel include the Sky standard deviation
(@mymath{\sigma}) on every group of pixels (a tile), calculated from the
undetected pixels in each tile, see @ref{Tessellation} and @ref{NoiseChisel
output}.
+Let's take @mymath{\sigma_m} as the median @mymath{\sigma} over the
successful tiles in the image (prior to interpolation or smoothing).
+It is recorded in the @code{MEDSTD} keyword of the @code{SKY_STD} extension of
NoiseChisel's output.
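+For example, you can read this keyword with Gnuastro's Fits program
(assuming a hypothetical NoiseChisel output called @file{nc.fits}):
+
+@example
+$ astfits nc.fits --hdu=SKY_STD --keyvalue=MEDSTD --quiet
+@end example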
-@item --max-x
-The maximum position of all objects and clumps along the first FITS axis.
+@cindex ACS camera
+@cindex Surface brightness limit
+@cindex Limit, surface brightness
+On different instruments, pixels cover different spatial angles over the sky.
+For example, the width of each pixel on the ACS camera on the Hubble Space
Telescope (HST) is roughly 0.05 seconds of arc, while the pixels of SDSS are
each 0.396 seconds of arc (almost eight times wider@footnote{Ground-based
instruments like the SDSS suffer from strong smoothing due to the atmosphere.
+Therefore, increasing the pixel resolution (or decreasing the width of a
pixel) will not increase the received information.}).
+Nevertheless, irrespective of its sky coverage, a pixel is our unit of data
collection.
-@item --min-y
-The minimum position of all objects and clumps along the second FITS axis.
+To start with, we define the low-level surface brightness limit or
@emph{depth}, in units of magnitude/pixel, with the equation below (assuming
the image has zero point magnitude @mymath{z} and we want the @mymath{n}th
multiple of @mymath{\sigma_m}).
-@item --max-y
-The maximum position of all objects and clumps along the second FITS axis.
+@dispmath{SB_{n\sigma,\rm pixel}=-2.5\times\log_{10}{(n\sigma_m)}+z \quad\quad
[mag/pixel]}
-@item --min-z
-The minimum position of all objects and clumps along the third FITS axis.
+@cindex XDF survey
+@cindex CANDELS survey
+@cindex eXtreme Deep Field (XDF) survey
+As an example, the XDF survey covers part of the sky that the HST has observed
the most (for 85 orbits) and is consequently very small (@mymath{\sim4} minutes
of arc, squared).
+On the other hand, the CANDELS survey is one of the widest multi-color
surveys done by the HST, covering several fields (about 720
arcmin@mymath{^2}), but its deepest fields have only 9 orbits of observation.
+The @mymath{1\sigma} depth of the XDF and CANDELS-deep surveys in the near
infrared WFC3/F160W filter are respectively 34.40 and 32.45 magnitudes/pixel.
+In a single orbit image, this same field has a @mymath{1\sigma} depth of 31.32
magnitudes/pixel.
+Recall that a larger magnitude corresponds to fainter objects, see
@ref{Brightness flux magnitude}.
-@item --max-z
-The maximum position of all objects and clumps along the third FITS axis.
+@cindex Pixel scale
+The low-level magnitude/pixel measurement above is only useful when all the
datasets you want to use, or compare, have the same pixel size.
+However, you will often find yourself using, or comparing, datasets from
various instruments with different pixel scales (projected pixel width, in
arc-seconds).
+If we know the pixel scale, we can obtain a more easily comparable surface
brightness limit in units of magnitude/arcsec@mymath{^2}.
+But another complication is that astronomical objects are usually larger than
1 arcsec@mymath{^2}.
+As a result, it is common to measure the surface brightness limit over a
larger (but fixed, depending on context) area.
-@item --clumps-x
-[Objects] The flux weighted center of all the clumps in this object along the
first FITS axis.
-See @option{--x}.
+Let's assume that every pixel is @mymath{p} arcsec@mymath{^2} and we want
the surface brightness limit for an object covering @mymath{A}
arcsec@mymath{^2} (so @mymath{A/p} is the number of pixels that cover an
area of @mymath{A} arcsec@mymath{^2}).
+On the other hand, noise adds in RMS (in quadrature)@footnote{If you add three datasets
with noise @mymath{\sigma_1}, @mymath{\sigma_2} and @mymath{\sigma_3}, the
resulting noise level is
@mymath{\sigma_t=\sqrt{\sigma_1^2+\sigma_2^2+\sigma_3^2}}, so when
@mymath{\sigma_1=\sigma_2=\sigma_3\equiv\sigma}, then
@mymath{\sigma_t=\sigma\sqrt{3}}.
+In this case, the area @mymath{A} is covered by @mymath{A/p} pixels, so the
noise level is @mymath{\sigma_t=\sigma\sqrt{A/p}}.}, hence the noise level in
@mymath{A} arcsec@mymath{^2} is @mymath{n\sigma_m\sqrt{A/p}}.
+But we want the result per arcsec@mymath{^2}, so we should divide this by
the area (@mymath{A} arcsec@mymath{^2}):
+@mymath{n\sigma_m\sqrt{A/p}/A=n\sigma_m\sqrt{A/(pA^2)}=n\sigma_m/\sqrt{pA}}.
+Plugging this into the magnitude equation, we get the @mymath{n\sigma}
surface brightness limit, over an area of @mymath{A} arcsec@mymath{^2}, in
units of magnitudes/arcsec@mymath{^2}:
-@item --clumps-y
-[Objects] The flux weighted center of all the clumps in this object along the
second FITS axis.
-See @option{--x}.
+@dispmath{SB_{{n\sigma,\rm A
arcsec}^2}=-2.5\times\log_{10}{\left(n\sigma_m\over
\sqrt{pA}\right)}+z \quad\quad [mag/arcsec^2]}
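+As a minimal sketch of this equation on the command-line (the
@mymath{n=3}, @mymath{\sigma_m=0.01}, @mymath{z=22.5}, @mymath{p=0.1} and
@mymath{A=100} values given to @command{awk} below are hypothetical):
+
+@example
+$ echo 1 | awk -v n=3 -v sigm=0.01 -v z=22.5 -v p=0.1 -v A=100 \
+       '@{print -2.5*log(n*sigm/sqrt(p*A))/log(10)+z@}'
+@end example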
-@item --clumps-z
-[Objects] The flux weighted center of all the clumps in this object along the
third FITS axis.
-See @option{--x}.
+@cindex World Coordinate System (WCS)
+MakeCatalog will calculate the input dataset's @mymath{SB_{n\sigma,\rm
pixel}} and @mymath{SB_{{n\sigma,\rm A arcsec}^2}} and will write them as the
@code{SBLMAGPIX} and @code{SBLMAG} keywords in the output catalog(s), see
@ref{MakeCatalog output}.
+You can set your desired @mymath{n}-th multiple of @mymath{\sigma} and the
@mymath{A} arcsec@mymath{^2} area using the following two options respectively:
@option{--sfmagnsigma} and @option{--sfmagarea} (see @ref{MakeCatalog output}).
+Just note that @mymath{SB_{{n\sigma,\rm A arcsec}^2}} is only calculated if
the input has a World Coordinate System (WCS).
+Without a WCS, the pixel scale cannot be derived.
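+For example, a hypothetical call like the following would configure a
@mymath{3\sigma} limit over 100 arcsec@mymath{^2}, and afterwards read the
keyword that was written into the output:
+
+@example
+$ astmkcatalog seg.fits --ids --magnitude --zeropoint=22.5 \
+               --sfmagnsigma=3 --sfmagarea=100 --output=cat.fits
+$ astfits cat.fits --hdu=1 --keyvalue=SBLMAG --quiet
+@end example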
-@item --clumps-geo-x
-[Objects] The geometric center of all the clumps in this object along the
first FITS axis.
-See @option{--geo-x}.
+@cindex Correlated noise
+@cindex Noise, correlated
+As you saw in its derivation, the calculation above extrapolates the noise in
one pixel over all the input's pixels!
+It therefore implicitly assumes that the noise is the same in all of the
pixels.
+But this only happens in individual exposures: reduced data will have
correlated noise because they are a stack of many individual exposures that
have been warped (thus mixing the pixel values).
+A more accurate measure, which will provide a realistic value for every
labeled region, is the @emph{upper-limit magnitude} discussed below.
-@item --clumps-geo-y
-[Objects] The geometric center of all the clumps in this object along the
second FITS axis.
-See @option{--geo-x}.
-@item --clumps-geo-z
-[Objects] The geometric center of all the clumps in this object along
-the third FITS axis. See @option{--geo-z}.
-@end table
+@node Upper limit magnitude of image, , Surface brightness limit of image,
Quantifying measurement limits
+@subsubsection Upper limit magnitude of image
+As mentioned in @ref{Upper limit magnitude of each detection}, the upper-limit
magnitude will depend on the shape of each object's footprint.
+Therefore we can measure a dataset's upper-limit magnitude using standard
shapes.
-@node Position measurements in WCS, Brightness measurements, Position
measurements in pixels, MakeCatalog measurements
-@subsubsection Position measurements in WCS
+Traditionally a circular aperture of a fixed size (in arcseconds) has been
used.
+For a full example of implementing this, see the respective section in the
tutorial (@ref{Image surface brightness limit}).
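+As a very rough sketch of such a measurement (the file names below are
hypothetical, and building the aperture label image is described in that
tutorial):
+
+@example
+$ astmkcatalog apertures.fits --hdu=1 --zeropoint=22.5 \
+               --valuesfile=image.fits --upmaskfile=detections.fits \
+               --upnsigma=3 --ids --upperlimit-mag --output=up.fits
+@end example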
-The position of a labeled region within your input dataset (in the World
Coordinate System, or WCS) can be measured with the options in this section.
-As you see below, there are various ways to define the ``position'' of an
object, so read the differences carefully to choose the one that corresponds
best to your usage.
-The most common WCS coordinates are Right Ascension (RA) and Declination in an
equatorial system.
-Therefore, to simplify their usage, we have special @option{--ra} and
@option{--dec} options.
-However, the WCS of datasets are in Galactic coordinates, so to be generic,
you can use the @option{--w1}, @option{--w2} or @option{--w3} (if you have a 3D
cube) options.
-In case your dataset's WCS is not in your desired system (for example it is
Galactic, but you want equatorial 2000), you can use the @option{--wcscoordsys}
option of Gnuastro's Fits program on the labeled image before running
MakeCatalog (see @ref{Keyword inspection and manipulation}).
-@table @option
-@item -r
-@itemx --ra
-Flux weighted right ascension of all objects or clumps, see @option{--x}.
-This is just an alias for one of the lower-level @option{--w1} or
@option{--w2} options.
-Using the FITS WCS keywords (@code{CTYPE}), MakeCatalog will determine which
axis corresponds to the right ascension.
-If no @code{CTYPE} keywords start with @code{RA}, an error will be printed
when requesting this column and MakeCatalog will abort.
-@item -d
-@itemx --dec
-Flux weighted declination of all objects or clumps, see @option{--x}.
-This is just an alias for one of the lower-level @option{--w1} or
@option{--w2} options.
-Using the FITS WCS keywords (@code{CTYPE}), MakeCatalog will determine which
axis corresponds to the declination.
-If no @code{CTYPE} keywords start with @code{DEC}, an error will be printed
when requesting this column and MakeCatalog will abort.
-@item --w1
-Flux weighted first WCS axis of all objects or clumps, see @option{--x}.
-The first WCS axis is commonly used as right ascension in images.
-@item --w2
-Flux weighted second WCS axis of all objects or clumps, see @option{--x}.
-The second WCS axis is commonly used as declination in images.
-@item --w3
-Flux weighted third WCS axis of all objects or clumps, see
-@option{--x}. The third WCS axis is commonly used as wavelength in integral
-field unit data cubes.
-@item --geo-w1
-Geometric center in first WCS axis of all objects or clumps, see
@option{--geo-x}.
-The first WCS axis is commonly used as right ascension in images.
-@item --geo-w2
-Geometric center in second WCS axis of all objects or clumps, see
@option{--geo-x}.
-The second WCS axis is commonly used as declination in images.
-@item --geo-w3
-Geometric center in third WCS axis of all objects or clumps, see
-@option{--geo-x}. The third WCS axis is commonly used as wavelength in
-integral field unit data cubes.
+@node Measuring elliptical parameters, Adding new columns to MakeCatalog,
Quantifying measurement limits, MakeCatalog
+@subsection Measuring elliptical parameters
-@item --clumps-w1
-[Objects] Flux weighted center in first WCS axis of all clumps in this object,
see @option{--x}.
-The first WCS axis is commonly used as right ascension in images.
+The shape or morphology of a target is one of its most commonly desired
parameters.
+Here, we will review the derivation of the most basic/simple morphological
parameters: the elliptical parameters for a set of labeled pixels.
+The elliptical parameters are: the (semi-)major axis, the (semi-)minor axis
and the position angle along with the central position of the profile.
+The derivations below follow the SExtractor manual derivations with some added
explanations for easier reading.
-@item --clumps-w2
-[Objects] Flux weighted declination of all clumps in this object, see
@option{--x}.
-The second WCS axis is commonly used as declination in images.
+@cindex Moments
+Let's begin with one dimension for simplicity: Assume we have a set of
@mymath{N} values @mymath{B_i} (for example, showing the spatial distribution
of a target's brightness), each at position @mymath{x_i}.
+The simplest parameter we can define is the geometric center of the object
(@mymath{x_g}) (ignoring the brightness values): @mymath{x_g=(\sum_ix_i)/N}.
+@emph{Moments} are defined to incorporate both the value (brightness) and
position of the data.
+The first moment can be written as:
-@item --clumps-w3
-[Objects] Flux weighted center in third WCS axis of all clumps in this object,
see @option{--x}.
-The third WCS axis is commonly used as wavelength in integral field unit data
cubes.
+@dispmath{\overline{x}={\sum_iB_ix_i \over \sum_iB_i}}
-@item --clumps-geo-w1
-[Objects] Geometric center right ascension of all clumps in this object, see
@option{--geo-x}.
-The first WCS axis is commonly used as right ascension in images.
+@cindex Variance
+@cindex Second moment
+@noindent
+This is essentially the weighted (by @mymath{B_i}) mean position.
+The geometric center (@mymath{x_g}, defined above) is a special case of this
with all @mymath{B_i=1}.
+The second moment is essentially the variance of the distribution:
-@item --clumps-geo-w2
-[Objects] Geometric center declination of all clumps in this object, see
@option{--geo-x}.
-The second WCS axis is commonly used as declination in images.
+@dispmath{\overline{x^2}\equiv{\sum_iB_i(x_i-\overline{x})^2 \over
+ \sum_iB_i} = {\sum_iB_ix_i^2 \over \sum_iB_i} -
+ 2\overline{x}{\sum_iB_ix_i\over\sum_iB_i} + \overline{x}^2
+ ={\sum_iB_ix_i^2 \over \sum_iB_i} - \overline{x}^2}
-@item --clumps-geo-w3
-[Objects] Geometric center in third WCS axis of all clumps in this object, see
@option{--geo-x}.
-The third WCS axis is commonly used as wavelength in integral field unit data
cubes.
-@end table
+@cindex Standard deviation
+@noindent
+The last step was done from the definition of @mymath{\overline{x}}.
+Hence, the square root of @mymath{\overline{x^2}} is the spatial standard
deviation (along the one-dimension) of this particular brightness distribution
(@mymath{B_i}).
+Crudely (or qualitatively), you can think of this standard deviation as the
distance (from @mymath{\overline{x}}) that contains a specific amount of the
flux (depending on the @mymath{B_i} distribution).
+Similar to the first moment, the geometric second moment can be found by
setting all @mymath{B_i=1}.
+So while the first moment quantified the position of the brightness
distribution, the second moment quantifies how that brightness is dispersed
about the first moment.
+In other words, it quantifies how ``sharp'' the object's image is.
-@node Brightness measurements, Surface brightness measurements, Position
measurements in WCS, MakeCatalog measurements
-@subsubsection Brightness measurements
+@cindex Floating point error
+Before continuing to two dimensions and the derivation of the elliptical
parameters, let's pause for an important implementation technicality.
+You can ignore this paragraph and the next two if you do not want to implement
these concepts.
+The basic definition (first definition of @mymath{\overline{x^2}} above) can
be used without any major problem.
+However, using this fraction requires two runs over the data: one run to
find @mymath{\overline{x}} and another to find @mymath{\overline{x^2}} from
@mymath{\overline{x}}; this can be slow.
+The advantage of the last fraction above is that we can estimate both the
first and second moments in one run (since the @mymath{-\overline{x}^2} term
can easily be added later).
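+As a small illustration of this one-run estimation with @command{awk} (the
two-column position and brightness values piped in below are arbitrary; the
output is @mymath{\overline{x}} and @mymath{\overline{x^2}}):
+
+@example
+$ printf "1 1\n2 4\n3 1\n" | awk \
+      '@{s+=$2; sx+=$2*$1; sxx+=$2*$1*$1@}
+       END@{m=sx/s; print m, sxx/s-m*m@}'
+@end example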
-Within an image, pixels have both a position and a value.
-In the sections above all the measurements involved position (see
@ref{Position measurements in pixels} or @ref{Position measurements in WCS}).
-The measurements in this section only deal with pixel values and ignore the
pixel positions completely.
-In other words, for the options of this section each labeled region within the
input is just a group of values (and their associated error values if
necessary), and they let you do various types of measurements on the resulting
distribution of values.
+The logarithmic nature of floating point number digitization creates a
complication, however: suppose the object is located between pixels 10000
and 10020.
+Hence the target's pixels are only distributed over 20 pixels (with a
standard deviation @mymath{<20}), while the mean has a value of
@mymath{\sim10000}.
+The @mymath{\sum_iB_ix_i^2} term will go to very large values while the
individual pixel differences will be orders of magnitude smaller.
+This will lower the accuracy of our calculation due to the limited accuracy
of floating point operations.
+The variance only depends on the distance of each point from the mean, so
we can shift all positions by a constant/arbitrary @mymath{K} which is much
closer to the mean: @mymath{\overline{x-K}=\overline{x}-K}.
+Hence we can calculate the second order moment using:
-@table @option
-@item --sum
-The sum of all pixel values associated to this label (object or clump).
-Note that if a sky value or image has been given, it will be subtracted before
any column measurement.
-For clumps, the ambient values (average of river pixels around the clump,
multiplied by the area of the clump) is subtracted, see @option{--river-mean}.
-So the sum of all the clump-sums in the clump catalog of one object will be
smaller than the @option{--clumps-sum} column of the objects catalog.
+@dispmath{ \overline{x^2}={\sum_iB_i(x_i-K)^2 \over \sum_iB_i} -
+ (\overline{x}-K)^2 }
-If no usable pixels are present over the clump or object (for example, they
are all blank), the returned value will be NaN (note that zero is meaningful).
+@noindent
+The closer @mymath{K} is to @mymath{\overline{x}}, the better (the sums of
squares will involve smaller numbers).
+As long as @mymath{K} is within the object limits (in the example above:
@mymath{10000\leq{K}\leq10020}), the floating point error induced in our
calculation will be negligible.
+For the simplest implementation, MakeCatalog takes @mymath{K} to be the
smallest position of the object in each dimension.
+Since @mymath{K} is arbitrary and an implementation/technical detail, we
will ignore it for the remainder of this discussion.
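+To see the benefit concretely, consider three pixels at positions 10000,
10010 and 10020, all with @mymath{B_i=1} (an arbitrary example).
+Computed directly, @mymath{\sum_ix_i^2/3\approx100200167} and
@mymath{\overline{x}^2=100200100}: their difference (the variance,
@mymath{\approx67}) only appears in the eighth significant digit, beyond
the @mymath{\sim7} digits of precision of a 32-bit floating point number.
+With @mymath{K=10000}, the same calculation becomes
@mymath{(0^2+10^2+20^2)/3-10^2\approx167-100=67}, with no loss of
precision.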
-@item --sum-error
-The (@mymath{1\sigma}) error in measuring the sum of values of a label
(objects or clumps).
+In two dimensions, the mean and variances can be written as:
-The returned value will be NaN when the label covers only NaN pixels in the
values image, or a pixel is NaN in the @option{--instd} image, but non-NaN in
the values image.
-The latter situation usually happens when there is a bug in the previous steps
of your analysis, and is important because those pixels with a NaN in the
@option{--instd} image may contribute significantly to the final error.
-If you want to ignore those pixels in the error measurement, set them to zero
(which is a meaningful number in such scenarios).
+@dispmath{\overline{x}={\sum_iB_ix_i\over \sum_iB_i}, \quad
+          \overline{x^2}={\sum_iB_ix_i^2 \over \sum_iB_i} -
+          \overline{x}^2}
+@dispmath{\overline{y}={\sum_iB_iy_i\over \sum_iB_i}, \quad
+          \overline{y^2}={\sum_iB_iy_i^2 \over \sum_iB_i} -
+          \overline{y}^2}
+@dispmath{\quad\quad\quad\quad\quad\quad\quad\quad\quad
+ \overline{xy}={\sum_iB_ix_iy_i \over \sum_iB_i} -
+ \overline{x}\times\overline{y}}
-@item --clumps-sum
-[Objects] The total sum of the pixels covered by clumps (before subtracting
the river) within each object.
-This is simply the sum of @option{--sum-no-river} in the clumps catalog (see
below).
-If no usable pixels are present over the clump or object (for example, they
are all blank), the stored value will be NaN (note that zero is meaningful).
+If an elliptical profile's major axis exactly lies along the @mymath{x}
axis, then @mymath{\overline{x^2}} will be directly proportional to the
profile's major axis, @mymath{\overline{y^2}} to its minor axis, and
@mymath{\overline{xy}=0}.
+However, in reality we are not that lucky and (assuming galaxies can be
parameterized as an ellipse) the major axis of galaxies can be in any
direction on the image (in fact, this is one of the core principles behind
weak-lensing shear estimation).
+So the purpose of the remainder of this section is to define a strategy to
measure the position angle and axis ratio of some randomly positioned ellipses
in an image, using the raw second moments that we have calculated above in our
image coordinates.
-@item --sum-no-river
-[Clumps] The sum of Sky (not river) subtracted clump pixel values.
-By definition, for the clumps, the average value of the rivers surrounding it
are subtracted from it for a first order accounting for contamination by
neighbors.
+Let's assume we have rotated the galaxy by @mymath{\theta}; the new second
order moments are:
-If no usable pixels are present over the clump or object (for example, they
are all blank), the stored value will be NaN (note that zero is meaningful).
+@dispmath{\overline{x_\theta^2} = \overline{x^2}\cos^2\theta +
+ \overline{y^2}\sin^2\theta -
+ 2\overline{xy}\cos\theta\sin\theta }
+@dispmath{\overline{y_\theta^2} = \overline{x^2}\sin^2\theta +
+ \overline{y^2}\cos^2\theta +
+ 2\overline{xy}\cos\theta\sin\theta}
+@dispmath{\overline{xy_\theta} = \overline{x^2}\cos\theta\sin\theta -
+ \overline{y^2}\cos\theta\sin\theta +
+ \overline{xy}(\cos^2\theta-\sin^2\theta)}
-@item --mean
-The mean sky subtracted value of pixels within the object or clump.
-For clumps, the average river flux is subtracted from the sky subtracted mean.
+@noindent
+The best @mymath{\theta} (@mymath{\theta_0}, where the major axis lies
along the @mymath{x_\theta} axis) can be found by:
-@item --std
-The standard deviation of the pixels within the object or clump.
-For clumps, the river pixels are not subtracted because that is a constant
(per pixel) value and should not affect the standard deviation.
+@dispmath{\left.{\partial \overline{x_\theta^2} \over \partial
\theta}\right|_{\theta_0}=0}
+Taking the derivative, we get:
+
+@dispmath{2\cos\theta_0\sin\theta_0(\overline{y^2}-\overline{x^2}) +
+2(\cos^2\theta_0-\sin^2\theta_0)\overline{xy}=0}
+
+@noindent
+When @mymath{\overline{x^2}\neq\overline{y^2}}, we can write:
+
+@dispmath{\tan2\theta_0 =
+2{\overline{xy} \over \overline{x^2}-\overline{y^2}}.}
-@item --median
-The median sky subtracted value of pixels within the object or clump.
-For clumps, the average river flux is subtracted from the sky subtracted
median.
+@cindex Position angle
+@noindent
+MakeCatalog uses the standard C math library's @code{atan2} function to
estimate @mymath{\theta_0}, which we define as the position angle of the
ellipse.
+To recall, this is the angle of the major axis of the ellipse with the
@mymath{x} axis.
+By definition, when the elliptical profile is rotated by @mymath{\theta_0},
then @mymath{\overline{xy_{\theta_0}}=0}, @mymath{\overline{x_{\theta_0}^2}}
will be the extent of the maximum variance and
@mymath{\overline{y_{\theta_0}^2}} the extent of the minimum variance (which
are perpendicular for an ellipse).
+Replacing @mymath{\theta_0} in the equations above for
@mymath{\overline{x_\theta}} and @mymath{\overline{y_\theta}}, we can get the
semi-major (@mymath{A}) and semi-minor (@mymath{B}) lengths:
-@item --maximum
-The maximum value of pixels within the object or clump.
-When the label (object or clump) is larger than three pixels, the maximum is
actually derived by the mean of the brightest three pixels, not the largest
pixel value of the same label.
-This is because noise fluctuations can be very strong in the extreme values of
the objects/clumps due to Poisson noise (which gets stronger as the mean gets
higher).
-Simply using the maximum pixel value will create a strong scatter in results
that depend on the maximum (for example, the @option{--fwhm} option also uses
this value internally).
+@dispmath{A^2\equiv\overline{x_{\theta_0}^2}= {\overline{x^2} +
+\overline{y^2} \over 2} + \sqrt{\left({\overline{x^2}-\overline{y^2} \over
2}\right)^2 + \overline{xy}^2}}
-@item --sigclip-number
-The number of elements/pixels in the dataset after sigma-clipping the object
or clump.
-The sigma-clipping parameters can be set with the @option{--sigmaclip} option
described in @ref{MakeCatalog inputs and basic settings}.
-For more on Sigma-clipping, see @ref{Sigma clipping}.
+@dispmath{B^2\equiv\overline{y_{\theta_0}^2}= {\overline{x^2} +
+\overline{y^2} \over 2} - \sqrt{\left({\overline{x^2}-\overline{y^2} \over
2}\right)^2 + \overline{xy}^2}}
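+As a small numeric sketch of these relations (the three second-moment
values piped into @command{awk} below, @mymath{\overline{x^2}=25},
@mymath{\overline{y^2}=9} and @mymath{\overline{xy}=6}, are arbitrary; it
prints the position angle in degrees, then @mymath{A} and @mymath{B} in
pixels):
+
+@example
+$ echo "25 9 6" \
+      | awk '@{t=0.5*atan2(2*$3, $1-$2);
+              r=sqrt((($1-$2)/2)^2+$3^2); m=($1+$2)/2;
+              print t*180/3.14159265, sqrt(m+r), sqrt(m-r)@}'
+@end example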
-@item --sigclip-median
-The sigma-clipped median value of the object of clump's pixel distribution.
-For more on sigma-clipping and how to define it, see @option{--sigclip-number}.
+As a summary, it is important to remember that @mymath{A} and @mymath{B}
are in units of pixels (the standard deviation of a positional distribution)
and that they represent the spatial light distribution of the object in both
image dimensions (rotated by @mymath{\theta_0}).
+When the object cannot be represented as an ellipse, this interpretation
breaks down: @mymath{\overline{xy_{\theta_0}}\neq0} and
@mymath{\overline{y_{\theta_0}^2}} will no longer be the minimum variance.
-@item --sigclip-mean
-The sigma-clipped mean value of the object of clump's pixel distribution.
-For more on sigma-clipping and how to define it, see @option{--sigclip-number}.
-@item --sigclip-std
-The sigma-clipped standard deviation of the object of clump's pixel
distribution.
-For more on sigma-clipping and how to define it, see @option{--sigclip-number}.
-@item -m
-@itemx --magnitude
-The magnitude of clumps or objects, see @option{--sum}.
-@item --magnitude-error
-The magnitude error of clumps or objects.
-The magnitude error is calculated from the signal-to-noise ratio (see
@option{--sn} and @ref{Quantifying measurement limits}).
-Note that until now this error assumes uncorrelated pixel values and also does
not include the error in estimating the aperture (or error in generating the
labeled image).
-For now these factors have to be found by other means.
-@url{https://savannah.gnu.org/task/index.php?14124, Task 14124} has been
defined for work on adding these sources of error too.
-The returned value will be NaN when the label covers only NaN pixels in the
values image, or a pixel is NaN in the @option{--instd} image, but non-NaN in
the values image.
-The latter situation usually happens when there is a bug in the previous steps
of your analysis, and is important because those pixels with a NaN in the
@option{--instd} image may contribute significantly to the final error.
-If you want to ignore those pixels in the error measurement, set them to zero
(which is a meaningful number in such scenarios).
+@node Adding new columns to MakeCatalog, MakeCatalog measurements, Measuring
elliptical parameters, MakeCatalog
+@subsection Adding new columns to MakeCatalog
-@item --clumps-magnitude
-[Objects] The magnitude of all clumps in this object, see
@option{--clumps-sum}.
+MakeCatalog is designed to allow easy addition of different measurements over
a labeled image (see @url{https://arxiv.org/abs/1611.06387v1, Akhlaghi [2016]}).
+A check-list style description of the necessary steps to do that is given
in this section.
+The common development characteristics of MakeCatalog and other Gnuastro
programs are explained in @ref{Developing}.
+We strongly encourage you to have a look at that chapter to greatly simplify
your navigation in the code.
+After adding and testing your column, you are most welcome (and encouraged)
to share it with us so we can add it to the next release of Gnuastro for
everyone else to also benefit from your efforts.
-@item --upperlimit
-The upper limit value (in units of the input image) for this object or clump.
-This is the sigma-clipped standard deviation of the random distribution,
multiplied by the value of @option{--upnsigma}).
-See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
-This is very important for the fainter and smaller objects in the image where
the measured magnitudes are not reliable.
+MakeCatalog will first make two passes over each label's pixels and do the
necessary raw/internal calculations.
+Once the passes are done, it will use the raw information for filling the
final catalog's columns.
+In the first pass it will gather mainly object information and in the
second pass, it will mainly focus on the clumps, or any other measurement
that needs an output from the first pass.
+These two passes are designed to be raw summations: no extra processing.
+This will allow parallel processing and simplicity/clarity.
+So if your new calculation needs new raw information from the pixels, then
you will need to also modify the respective @code{mkcatalog_first_pass} and
@code{mkcatalog_second_pass} functions (both in
@file{bin/mkcatalog/mkcatalog.c}) and define new raw table columns in
@file{main.h} (hopefully the comments in the code are clear enough).
-@item --upperlimit-mag
-The upper limit magnitude for this object or clump.
-See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
-This is very important for the fainter and smaller objects in the image where
the measured magnitudes are not reliable.
+In all these different places, the final columns are sorted in the same order
(same order as @ref{Invoking astmkcatalog}).
+This allows a particular column/option to be easily found in all steps.
+Therefore, in adding your new option, be sure to keep it in the same
relative place in the list in all the separate places (it does not
necessarily have to be at the end), and near conceptually similar options.
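+In practice, a simple way to see all the places that an existing
(conceptually similar) column touches is to search for its option name in
MakeCatalog's source directory, for example:
+
+@example
+$ grep -r 'sigclip-mean' bin/mkcatalog/
+@end example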
-@item --upperlimit-onesigma
-The @mymath{1\sigma} upper limit value (in units of the input image) for this
object or clump.
-See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
-When @option{--upnsigma=1}, this column's values will be the same as
@option{--upperlimit}.
+@table @file
-@item --upperlimit-sigma
-The position of the label's sum measured within the distribution of randomly
placed upperlimit measurements in units of the distribution's @mymath{\sigma}
or standard deviation.
-See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
+@item main.h
+The @code{objectcols} and @code{clumpcols} enumerated variables (@code{enum})
define the raw/internal calculation columns.
+If your new column requires new raw calculations, add a row to the respective
list.
+If your calculation requires any other settings or parameters, you should
add a variable to the @code{mkcatalogparams} structure.
-@item --upperlimit-quantile
-The position of the label's sum within the distribution of randomly placed
upperlimit measurements as a quantile (value between 0 or 1).
-See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
-If the object is brighter than the brightest randomly placed profile, a value
of @code{inf} is returned.
-If it is less than the minimum, a value of @code{-inf} is reported.
+@item ui.c
+If the new column needs raw calculations (an entry was added in
@code{objectcols} and @code{clumpcols}), specify which inputs it needs in
@code{ui_necessary_inputs}, similar to the other options.
+Afterwards, if your column includes any particular settings (you needed to add
a variable to the @code{mkcatalogparams} structure in @file{main.h}), you
should do the sanity checks and preparations for it here.
-@item --upperlimit-skew
-@cindex Skewness
-This column contains the non-parametric skew of the @mymath{\sigma}-clipped
random distribution that was used to estimate the upper-limit magnitude.
-Taking @mymath{\mu} as the mean, @mymath{\nu} as the median and
@mymath{\sigma} as the standard deviation, the traditional definition of
skewness is defined as: @mymath{(\mu-\nu)/\sigma}.
+@item ui.h
+The @code{option_keys_enum} associates a unique value with each MakeCatalog
option.
+For options that have a short version, the single-character short option is
used for the value.
+Those that do not have a short option version automatically get a large
integer.
+You should add a variable here to identify your desired column.
-This can be a good measure to see how much you can trust the random
measurements, or in other words, how accurately the regions with signal have
been masked/detected. If the skewness is strong (and to the positive), then you
can tell that you have a lot of undetected signal in the dataset, and therefore
that the upper-limit measurement (and other measurements) are not reliable.
-@item --river-mean
-[Clumps] The average of the river pixel values around this clump.
-River pixels were defined in Akhlaghi and Ichikawa 2015.
-In short they are the pixels immediately outside of the clumps.
-This value is used internally to find the sum (or magnitude) and signal to
noise ratio of the clumps.
-It can generally also be used as a scale to gauge the base (ambient) flux
surrounding the clump.
-In case there was no river pixels, then this column will have the value of the
Sky under the clump.
-So note that this value is @emph{not} sky subtracted.
+@cindex GNU C library
+@item args.h
+This file specifies all the parameters for the GNU C library's Argp
structure, which is in charge of reading the user's options.
+To define your new column, just copy an existing set of parameters and
change the first, second and fifth values (the only ones that differ between
all the columns); you should use the macro you defined in @file{ui.h} here.
-@item --river-num
-[Clumps] The number of river pixels around this clump, see
-@option{--river-mean}.
-@item --sn
-The Signal to noise ratio (S/N) of all clumps or objects.
-See Akhlaghi and Ichikawa (2015) for the exact equations used.
+@item columns.c
+This file contains the main definition and high-level calculation of your new
column through the @code{columns_define_alloc} and @code{columns_fill}
functions.
+In the first, you specify the basic information about the column: its name,
units, comments, type (see @ref{Numeric data types}) and how it should be
printed if the output is a text file.
+You should also specify the raw/internal columns that are necessary for
this column here, as the many existing examples show.
+Through the object and clump types, you can specify if this column is only
for clumps, only for objects, or for both.
-The returned value will be NaN when the label covers only NaN pixels in the
values image, or a pixel is NaN in the @option{--instd} image, but non-NaN in
the values image.
-The latter situation usually happens when there is a bug in the previous steps
of your analysis, and is important because those pixels with a NaN in the
@option{--instd} image may contribute significantly to the final error.
-If you want to ignore those pixels in the error measurement, set them to zero
(which is a meaningful number).
+The second main function (@code{columns_fill}) writes the final value into the
appropriate column for each object and clump.
+As you can see in the many existing examples, you can define your processing
on the raw/internal calculations here and save them in the output.
-@item --sky
-The sky flux (per pixel) value under this object or clump.
-This is actually the mean value of all the pixels in the sky image that lie on
the same position as the object or clump.
+@item mkcatalog.c
+This file contains the low-level parsing functions.
+For efficiency, the parsing is done in parallel through the
@code{mkcatalog_single_object} function.
+This function initializes the necessary arrays and calls the lower-level
@code{parse_objects} and @code{parse_clumps} for actually going over the pixels.
+They are all heavily commented, so you should be able to follow where to add
your necessary low-level calculations.
+
+@item doc/gnuastro.texi
+Update this manual and add a description for the new column.
-@item --sky-std
-The sky value standard deviation (per pixel) for this clump or object.
-This is the square root of the mean variance under the object, or the root
mean square.
@end table
-@node Surface brightness measurements, Morphology measurements nonparametric,
Brightness measurements, MakeCatalog measurements
-@subsubsection Surface brightness measurements
-In astronomy, Surface brightness is most commonly measured in units of
magnitudes per arcsec@mymath{^2} (for the formal definition, see
@ref{Brightness flux magnitude}).
-Therefore it involves both the values of the pixels within each input label
(or output row) and their position.
-@table @option
-@item --sb
-The surface brightness (in units of mag/arcsec@mymath{^2}) of the labeled
region (objects or clumps).
-For more on the definition of the surface brightness, see @ref{Brightness flux
magnitude}.
-@item --sb-error
-Error in measuring the surface brightness (the @option{--sb} column).
-This column will use the value given to @option{--spatialresolution} in the
processing (in pixels).
-For more on @option{--spatialresolution}, see @ref{MakeCatalog inputs and
basic settings} and for the equation used to derive the surface brightness
error, see @ref{Surface brightness error of each detection}.
+@node MakeCatalog measurements, Invoking astmkcatalog, Adding new columns to
MakeCatalog, MakeCatalog
+@subsection MakeCatalog measurements
-@item --upperlimit-sb
-The upper-limit surface brightness (in units of mag/arcsec@mymath{^2}) of this
labeled region (object or clump).
-In other words, this option measures the surface brightness of noise within
the footprint of each input label.
+MakeCatalog's output measurements/columns can be specified using command-line
options (@ref{Options}).
+The current measurements in MakeCatalog are those which only produce one final
value for each label (for example, its magnitude: a single number).
+All the different label's measurements can be written as one column in a final
table/catalog that contains other columns for other similar single-number
measurements.
-This is just a simple wrapper over lower-level columns: setting B and A as the
value in the columns @option{--upperlimit} and @option{--area-arcsec2}, we fill
this column by simply use the surface brightness equation of @ref{Brightness
flux magnitude}.
+The majority of this section is devoted to MakeCatalog's single-valued
measurements.
+However, MakeCatalog can also do measurements that produce more than one value
for each label.
+Currently the only such measurement is the generation of spectra from 3D
cubes with the @option{--spectrum} option, which is discussed at the end of
this section.
-@item --half-sum-sb
-Surface brightness (in units of mag/arcsec@mymath{^2}) within the area that
contains half the total sum of the label's pixels (object or clump).
-This is useful as a measure of the sharpness of an astronomical object: for
example a star will have very few pixels at half the maximum, so its
@option{--half-sum-sb} will be much brighter than a galaxy at the same
magnitude.
-Also consider @option{--half-max-sb} below.
+Command-line options are used to identify which measurements you want in the
final catalog(s) and in what order.
+If any of the options below is called on the command-line or in any of the
configuration files, it will be included as a column in the output catalog.
+The order of the columns is in the same order as the options were seen by
MakeCatalog (see @ref{Configuration file precedence}).
+Some of the columns apply to both ``objects'' and ``clumps'' and some are
particular to only one of them (for the definition of ``objects'' and
``clumps'', see @ref{Segment}).
+Columns/options that are unique to one catalog (only objects, or only clumps),
are explicitly marked with [Objects] or [Clumps] to specify the catalog they
will be placed in.
-This column just plugs in the values of half the value of the @option{--sum}
column and the @option{--half-sum-area} column, into the surface brightness
equation.
-Therefore please see the description in @option{--half-sum-area} to understand
the systematics of this column and potential biases (see @ref{Morphology
measurements nonparametric}).
+@menu
+* Identifier columns:: Identifying labels of each row (object/clumps).
+* Position measurements in pixels:: Containing image/pixel (X/Y) measurements.
+* Position measurements in WCS:: Containing WCS (for example RA/Dec)
measurements.
+* Brightness measurements:: Using pixel values of each label.
+* Surface brightness measurements:: Various ways to measure surface
brightness.
+* Morphology measurements nonparametric:: Non-parametric morphology.
+* Morphology measurements elliptical:: Elliptical morphology measurements.
+* Measurements per slice spectra:: Measurements on each slice (like spectra).
+@end menu
-@item --half-max-sb
-The surface brightness (in units of mag/arcsec@mymath{^2}) within the region
that contains half the maximum value of the labeled region.
-Like @option{--half-sum-sb} this option this is a good way to identify the
``central'' surface brightness of an object.
-To know when this measurement is reasonable, see the description of
@option{--fwhm} in @ref{Morphology measurements nonparametric}.
+@node Identifier columns, Position measurements in pixels, MakeCatalog
measurements, MakeCatalog measurements
+@subsubsection Identifier columns
-@item --sigclip-mean-sb
-Surface brightness (over 1 pixel's area in arcsec@mymath{^2}) of the
sigma-clipped mean value of the pixel values distribution associated to each
label (object or clump).
+This is useful in scenarios where your labels have approximately
@emph{constant} surface brightness values @emph{after} removing outliers
(for example, in a radial profile; see @ref{Invoking
astscript-radial-profile}).
+The identifier of each row (group of measurements) is usually the first thing
you will be requesting from MakeCatalog.
+Without the identifier, it is not clear which measurement corresponds to which
label for the input.
-In other scenarios it should be used with extreme care.
-For example over the full area of a galaxy/star the pixel distribution is not
constant (or symmetric after adding noise), their pixel distributions are
inherently skewed (with fewer pixels in the center, having a very large value
and many pixels in the outer parts having lower values).
-Therefore, sigma-clipping is not meaningful for such objects!
-For more on the definition of the surface brightness, see @ref{Brightness flux
magnitude}, for more on sigma-clipping, see @ref{Sigma clipping}.
+Since MakeCatalog can also optionally take sub-structure labels (clumps;
see @ref{Segment}), there are various identifiers in general that are listed
below.
+The most generic (and shortest and easiest to type!) is the @option{--ids}
option which can be used in object-only or object-clump catalogs.
-The error in this magnitude can be retrieved from the
@option{--sigclip-mean-sb-delta} column described below, and you can use the
@option{--sigclip-std-sb} column to find when the magnitude has become
noise-dominated (signal-to-noise ratio is roughly 1).
-See the description of these two options for more.
+@table @option
+@item -i
+@itemx --ids
+This is a unique option which can add multiple columns to the final
catalog(s), as shown in the example after this list.
+Calling this option will put the object IDs (@option{--obj-id}) in the
objects catalog, and the host-object-ID (@option{--host-obj-id}) and
ID-in-host-object (@option{--id-in-host-obj}) into the clumps catalog.
+Hence if only object catalogs are required, it has the same effect as
@option{--obj-id}.
-@item --sigclip-mean-sb-delta
-Scatter in the @option{--sigclip-mean-sb} without using the standard deviation
of each pixel (that is given by @option{--instd} in @ref{MakeCatalog inputs and
basic settings}).
-The scatter here is measured from the values of the label themselves.
-This measurement is therefore most meaningful when you expect the flux across
one label to be constant (as in a radial profile for example).
+@item --obj-id
+[Objects] ID of this object.
-This is calculated using the equation in @ref{Surface brightness error of each
detection}, where @mymath{\Delta{A}=0} (since sigma-clip is calculated per
pixel and there is no error in a single pixel).
-Within the equation to derive @mymath{\Delta{M}} (the error in magnitude,
derived in @ref{Magnitude measurement error of each detection}), the
signal-to-noise ratio is defined by dividing the sigma-clipped mean by the
sigma-clipped standard deviation.
+@item -j
+@itemx --host-obj-id
+[Clumps] The ID of the object which hosts this clump.
-@item --sigclip-std-sb
-The surface brightness of the sigma-clipped standard deviation of all the
pixels with the same label.
-For labels that are expected to have the same value in all their pixels (for
example each annulus of a radial profile) this can be used to find the reliable
(@mymath{1\sigma}) surface brightness for that label.
-In other words, if @option{--sigclip-mean-sb} is fainter than the value of
this column, you know that noise is becoming significant.
-However, as described in @option{--sigclip-mean-sb}, the sigma-clipped
measurements of MakeCatalog should only be used in certain situations like
radial profiles, see the description there for more.
+@item --id-in-host-obj
+[Clumps] The ID of this clump in its host object.
@end table
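+For example, the following hypothetical call would write the identifiers
and flux-weighted positions into the output catalog(s) (with
@option{--clumpscat}, a clumps catalog is also produced when the input has a
clumps extension):
+
+@example
+$ astmkcatalog seg.fits --ids --x --y --clumpscat --output=cat.fits
+@end example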
-@node Morphology measurements nonparametric, Morphology measurements
elliptical, Surface brightness measurements, MakeCatalog measurements
-@subsubsection Morphology measurements (non-parametric)
-
-Morphology defined as a way to quantify the ``shape'' of an object in your
input image.
-This includes both the position and value of the pixels within your input
labels.
-There are many ways to define the morphology of an object.
-In this section, we will review the available non-parametric measures of
morphology.
-By non-parametric, we mean that no functional shape is assumed for the
measurement.
+@node Position measurements in pixels, Position measurements in WCS,
Identifier columns, MakeCatalog measurements
+@subsubsection Position measurements in pixels
-In @ref{Morphology measurements elliptical} you can see some parametric
elliptical measurements (which are only valid when the object is actually an
ellipse).
+The position of a labeled region within your input dataset (in its own units)
can be measured with the options in this section.
+By ``in its own units'' we mean pixels in a 2D image or voxels in a 3D cube.
+For example, if the flux-weighted center of a label lies at 123 pixels
along the horizontal axis and 456 pixels along the vertical, the
@option{--x} and @option{--y} options will put values of 123 and 456 in
their respective columns.
+As you see below, there are various ways to define the ``position'' of an
object, so read the differences carefully to choose the one that corresponds
best to your usage.
@table @option
-@item --num-clumps
-[Objects] The number of clumps in this object.
-
-@item --area
-The raw area (number of pixels/voxels) in any clump or object independent of
what pixel it lies over (if it is NaN/blank or unused for example).
+@item -x
+@itemx --x
+The flux weighted center of all objects and clumps along the first FITS axis
(horizontal when viewed in SAO DS9), see @mymath{\overline{x}} in
@ref{Measuring elliptical parameters}.
+The weight has to have a positive value (pixel value larger than the Sky
value) to be meaningful! Especially when doing matched photometry, this
might not happen: no pixel value may be above the Sky value.
+For such detections, the geometric center will be reported in this column (see
@option{--geo-x}).
+You can use @option{--weight-area} to see which was used.
-@item --arcsec2-area
-The used (non-blank in values image) area of the labeled region in units of
arc-seconds squared.
-This column is just the value of the @option{--area} column, multiplied by the
area of each pixel in the input image (in units of arcsec^2).
-Similar to the @option{--ra} or @option{--dec} columns, for this option to
work, the objects extension used has to have a WCS structure.
+@item -y
+@itemx --y
+The flux weighted center of all objects and clumps along the second FITS axis
(vertical when viewed in SAO DS9).
+See @option{--x}.
-@item --area-min-val
-The number of pixels that are equal to the minimum value of the labeled region
(clump or object).
+@item -z
+@itemx --z
+The flux weighted center of all objects and clumps along the third FITS
+axis. See @option{--x}.
-@item --area-max-val
-The number of pixels that are equal to the maximum value of the labeled region
(clump or object).
+@item --geo-x
+The geometric center of all objects and clumps along the first FITS axis.
+The geometric center is the average of the pixel positions, irrespective of
their pixel values.
-@item --area-xy
-@cindex IFU: Integral Field Unit
-@cindex Integral Field Unit
-Similar to @option{--area}, when the clump or object is projected onto the
first two dimensions.
-This is only available for 3-dimensional datasets.
-When working with Integral Field Unit (IFU) datasets, this projection onto the
first two dimensions would be a narrow-band image.
+@item --geo-y
+The geometric center of all objects and clumps along the second FITS axis,
see @option{--geo-x}.
-@item --fwhm
-@cindex FWHM
-The full width at half maximum (in units of pixels, along the semi-major axis)
of the labeled region (object or clump).
-The maximum value is estimated from the mean of the top-three pixels with the
highest values, see the description under @option{--maximum}.
-The number of pixels that have half the value of that maximum are then found
(value in the @option{--half-max-area} column) and a radius is estimated from
the area.
-See the description under @option{--half-sum-radius} for more on converting
area to radius along major axis.
+@item --geo-z
+The geometric center of all objects and clumps along the third FITS axis,
see @option{--geo-x}.
-Because of its non-parametric nature, this column is most reliable on clumps
and should only be used in objects with great caution.
-This is because objects can have more than one clump (peak with true signal)
and multiple peaks are not treated separately in objects, so the result of this
column will be biased.
+@item --min-val-x
+Position of pixel with minimum value in objects and clumps, along the first
FITS axis.
-Also, because of its non-parametric nature, this FWHM it does not account for
the PSF, and it will be strongly affected by noise if the object is
faint/diffuse
-So when half the maximum value (which can be requested using the
@option{--maximum} column) is too close to the local noise level (which can be
requested using the @option{--sky-std} column), the value returned in this
column is meaningless (its just noise peaks which are randomly distributed over
the area).
-You can therefore use the @option{--maximum} and @option{--sky-std} columns to
remove, or flag, unreliable FWHMs.
-For example, if a labeled region's maximum is less than 2 times the sky
standard deviation, the value will certainly be unreliable (half of that is
@mymath{1\sigma}!).
-For a more reliable value, this fraction should be around 4 (so half the
maximum is 2@mymath{\sigma}).
+@item --max-val-x
+Position of pixel with maximum value in objects and clumps, along the first
FITS axis.
-@item --half-max-area
-The number of pixels with values larger than half the maximum flux within the
labeled region.
-This option is used to estimate @option{--fwhm}, so please read the notes
there for the caveats and necessary precautions.
+@item --min-val-y
+Position of pixel with minimum value in objects and clumps, along the
second FITS axis.
-@item --half-max-radius
-The radius of region containing half the maximum flux within the labeled
region.
-This is just half the value reported by @option{--fwhm}.
+@item --max-val-y
+Position of pixel with maximum value in objects and clumps, along the
second FITS axis.
-@item --half-max-sum
-The sum of the pixel values containing half the maximum flux within the
labeled region (or those that are counted in @option{--halfmaxarea}).
-This option uses the pixels within @option{--fwhm}, so please read the notes
there for the caveats and necessary precautions.
+@item --min-val-z
+Position of pixel with minimum value in objects and clumps, along the third
FITS axis.
-@item --half-sum-area
-The number of pixels that contain half the object or clump's total sum of
pixels (half the value in the @option{--sum} column).
-To count this area, all the non-blank values associated with the given label
(object or clump) will be sorted and summed in order (starting from the
maximum), until the sum becomes larger than half the total sum of the label's
pixels.
+@item --max-val-z
+Position of pixel with maximum value in objects and clumps, along the third
FITS axis.
-This option is thus good for clumps (which are defined to have a single peak
in their morphology), but for objects you should be careful: if the object
includes multiple peaks/clumps at roughly the same level, then the area
reported by this option will be distributed over all the peaks.
+@item --min-x
+The minimum position of all objects and clumps along the first FITS axis.
-@item --half-sum-radius
-Radius (in units of pixels) derived from the area that contains half the total
sum of the label's pixels (value reported by @option{--halfsumarea}).
-If the area is @mymath{A_h} and the axis ratio is @mymath{q}, then the value
returned in this column is @mymath{\sqrt{A_h/({\pi}q)}}.
-This option is a good measure of the concentration of the @emph{observed}
(after PSF convolution and noisy) object or clump,
-But as described below it underestimates the effective radius.
-Also, it should be used in caution with objects that may have multiple clumps.
-It is most reliable with clumps or objects that have one or zero clumps, see
the note under @option{--halfsumarea}.
+@item --max-x
+The maximum position of all objects and clumps along the first FITS axis.
-@cindex Ellipse area
-@cindex Area, ellipse
-Recall that in general, for an ellipse with semi-major axis @mymath{a},
semi-minor axis @mymath{b}, and axis ratio @mymath{q=b/a} the area (@mymath{A})
is @mymath{A={\pi}ab={\pi}qa^2}.
-For a circle (where @mymath{q=1}), this simplifies to the familiar
@mymath{A={\pi}a^2}.
+@item --min-y
+The minimum position of all objects and clumps along the second FITS axis.
-@cindex S@'ersic profile
-@cindex Effective radius
-This option should not be confused with the @emph{effective radius} for
S@'ersic profiles, commonly written as @mymath{r_e}.
-For more on the S@'ersic profile and @mymath{r_e}, please see @ref{Galaxies}.
-Therefore, when @mymath{r_e} is meaningful for the target (the target is
elliptically symmetric and can be parameterized as a S@'ersic profile),
@mymath{r_e} should be derived from fitting the profile with a S@'ersic
function which has been convolved with the PSF.
-But from the equation above, you see that this radius is derived from the raw
image's labeled values (after convolution, with no parametric profile), so this
column's value will generally be (much) smaller than @mymath{r_e}, depending on
the PSF, depth of the dataset, the morphology, or if a fraction of the profile
falls on the edge of the image.
+@item --max-y
+The maximum position of all objects and clumps along the second FITS axis.
-In other words, this option can only be interpreted as an effective radius if
there is no noise and no PSF and the profile within the image extends to
infinity (or a very large multiple of the effective radius) and it not near the
edge of the image.
+@item --min-z
+The minimum position of all objects and clumps along the third FITS axis.
-@item --frac-max1-area
-@itemx --frac-max2-area
-Number of pixels brighter than the given fraction(s) of the maximum pixel
value.
-For the maximum value, see the description of @option{--maximum} column.
-The fraction(s) are given through the @option{--frac-max} option (that can
take two values) and is described in @ref{MakeCatalog inputs and basic
settings}.
-Recall that in @option{--halfmaxarea}, the fraction is fixed to 0.5.
-Hence, added with these two columns, you can sample three parts of the profile
area.
+@item --max-z
+The maximum position of all objects and clumps along the third FITS axis.
-@item --frac-max1-sum
-@itemx --frac-max2-sum
-Sum of pixels brighter than the given fraction(s) of the maximum pixel value.
-For the maximum value, see the description of @option{--maximum} column below.
-The fraction(s) are given through the @option{--frac-max} option (that can
take two values) and is described in @ref{MakeCatalog inputs and basic
settings}.
-Recall that in @option{--halfmaxsum}, the fraction is fixed to 0.5.
-Hence, added with these two columns, you can sample three parts of the
profile's sum of pixels.
+@item --clumps-x
+[Objects] The flux weighted center of all the clumps in this object along the
first FITS axis.
+See @option{--x}.
-@item --frac-max1-radius
-@itemx --frac-max2-radius
-Radius (in units of pixels) derived from the area that contains the given
fractions of the maximum valued pixel(s) of the label's pixels (value reported
by @option{--frac-max1-area} or @option{--frac-max2-area}).
-For the maximum value, see the description of @option{--maximum} column below.
-The fractions are given through the @option{--frac-max} option (that can take
two values) and is described in @ref{MakeCatalog inputs and basic settings}.
-Recall that in @option{--fwhm}, the fraction is fixed to 0.5.
-Hence, added with these two columns, you can sample three parts of the
profile's radius.
+@item --clumps-y
+[Objects] The flux weighted center of all the clumps in this object along the
second FITS axis.
+See @option{--x}.
-@item --clumps-area
-[Objects] The total area of all the clumps in this object.
+@item --clumps-z
+[Objects] The flux weighted center of all the clumps in this object along the
third FITS axis.
+See @option{--x}.
-@item --weight-area
-The area (number of pixels) used in the flux weighted position calculations.
+@item --clumps-geo-x
+[Objects] The geometric center of all the clumps in this object along the
first FITS axis.
+See @option{--geo-x}.
-@item --geo-area
-The area of all the pixels labeled with an object or clump.
-Note that unlike @option{--area}, pixel values are completely ignored in this
column.
-For example, if a pixel value is blank, it will not be counted in
@option{--area}, but will be counted here.
+@item --clumps-geo-y
+[Objects] The geometric center of all the clumps in this object along the
second FITS axis.
+See @option{--geo-x}.
-@item --geo-area-xy
-Similar to @option{--geo-area}, when the clump or object is projected onto the
first two dimensions.
-This is only available for 3-dimensional datasets.
-When working with Integral Field Unit (IFU) datasets, this projection onto the
first two dimensions would be a narrow-band image.
+@item --clumps-geo-z
+[Objects] The geometric center of all the clumps in this object along
+the third FITS axis. See @option{--geo-z}.
@end table
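+
+For example, assuming a Segment output named @file{seg-out.fits} (the file
name is only an assumption for this sketch), a command like the following
would request the bounding box of each label in pixel coordinates:
+
+@example
+$ astmkcatalog seg-out.fits --ids --min-x --max-x --min-y --max-y
+@end example
+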
-@node Morphology measurements elliptical, Measurements per slice spectra,
Morphology measurements nonparametric, MakeCatalog measurements
-@subsubsection Morphology measurements (elliptical)
+@node Position measurements in WCS, Brightness measurements, Position
measurements in pixels, MakeCatalog measurements
+@subsubsection Position measurements in WCS
-When your target objects are sufficiently ellipse-like, you can use the
measurements below to quantify the various parameters of the ellipse.
-For details of how the elliptical parameters are measured, see @ref{Measuring
elliptical parameters}.
-For non-parametric morphological measurements, see @ref{Morphology
measurements nonparametric}.
-The measures that start with @option{--geo-*} ignore the pixel values and just
do the measurements on the label's ``geometric'' shape.
+The position of a labeled region within your input dataset (in the World
Coordinate System, or WCS) can be measured with the options in this section.
+As you see below, there are various ways to define the ``position'' of an
object, so read the differences carefully to choose the one that corresponds
best to your usage.
-@table @option
-@item --semi-major
-The pixel-value weighted root mean square (RMS) along the semi-major axis of
the profile (assuming it is an ellipse) in units of pixels.
+The most common WCS coordinates are Right Ascension (RA) and Declination in an
equatorial system.
+Therefore, to simplify their usage, we have special @option{--ra} and
@option{--dec} options.
+However, some datasets may have their WCS in other coordinate systems (for
example, Galactic coordinates), so to be generic, you can use the
@option{--w1}, @option{--w2} or @option{--w3} (if you have a 3D cube) options.
+In case your dataset's WCS is not in your desired system (for example it is
Galactic, but you want equatorial 2000), you can use the @option{--wcscoordsys}
option of Gnuastro's Fits program on the labeled image before running
MakeCatalog (see @ref{Keyword inspection and manipulation}).
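+
+As a minimal sketch of this conversion (the file name is an assumption, and
@code{eq-j2000} is assumed to be the identifier of the desired output system;
see the documentation of @option{--wcscoordsys} for the exact identifiers):
+
+@example
+$ astfits seg-out.fits --wcscoordsys=eq-j2000
+$ astmkcatalog seg-out.fits --ids --ra --dec
+@end example
+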
-@item --semi-minor
-The pixel-value weighted root mean square (RMS) along the semi-minor axis of
the profile (assuming it is an ellipse) in units of pixels.
+@table @option
+@item -r
+@itemx --ra
+Flux weighted right ascension of all objects or clumps, see @option{--x}.
+This is just an alias for one of the lower-level @option{--w1} or
@option{--w2} options.
+Using the FITS WCS keywords (@code{CTYPE}), MakeCatalog will determine which
axis corresponds to the right ascension.
+If no @code{CTYPE} keywords start with @code{RA}, an error will be printed
when requesting this column and MakeCatalog will abort.
-@item --axis-ratio
-The pixel-value weighted axis ratio (semi-minor/semi-major) of the object or
clump.
+@item -d
+@itemx --dec
+Flux weighted declination of all objects or clumps, see @option{--x}.
+This is just an alias for one of the lower-level @option{--w1} or
@option{--w2} options.
+Using the FITS WCS keywords (@code{CTYPE}), MakeCatalog will determine which
axis corresponds to the declination.
+If no @code{CTYPE} keywords start with @code{DEC}, an error will be printed
when requesting this column and MakeCatalog will abort.
-@item --position-angle
-The pixel-value weighted angle of the semi-major axis with the first FITS axis
in degrees.
+@item --w1
+Flux weighted first WCS axis of all objects or clumps, see @option{--x}.
+The first WCS axis is commonly used as right ascension in images.
-@item --geo-semi-major
-The geometric (ignoring pixel values) root mean square (RMS) along the
semi-major axis of the profile, assuming it is an ellipse, in units of pixels.
+@item --w2
+Flux weighted second WCS axis of all objects or clumps, see @option{--x}.
+The second WCS axis is commonly used as declination in images.
-@item --geo-semi-minor
-The geometric (ignoring pixel values) root mean square (RMS) along the
semi-minor axis of the profile, assuming it is an ellipse, in units of pixels.
+@item --w3
+Flux weighted third WCS axis of all objects or clumps, see
+@option{--x}. The third WCS axis is commonly used as wavelength in integral
+field unit data cubes.
-@item --geo-axis-ratio
-The geometric (ignoring pixel values) axis ratio of the profile, assuming it
is an ellipse.
+@item --geo-w1
+Geometric center in first WCS axis of all objects or clumps, see
@option{--geo-x}.
+The first WCS axis is commonly used as right ascension in images.
-@item --geo-position-angle
-The geometric (ignoring pixel values) angle of the semi-major axis with the
first FITS axis in degrees.
-@end table
+@item --geo-w2
+Geometric center in second WCS axis of all objects or clumps, see
@option{--geo-x}.
+The second WCS axis is commonly used as declination in images.
-@node Measurements per slice spectra, , Morphology measurements elliptical,
MakeCatalog measurements
-@subsubsection Measurements per slice (spectra)
+@item --geo-w3
+Geometric center in third WCS axis of all objects or clumps, see
+@option{--geo-x}. The third WCS axis is commonly used as wavelength in
+integral field unit data cubes.
-@cindex Spectrum
-@cindex 3D data-cubes
-@cindex Cubes (3D data)
-@cindex IFU: Integral Field Unit
-@cindex Integral field unit (IFU)
-@cindex Spectrum (of astronomical source)
-When the input is a 3D data cube, MakeCatalog has the following multi-valued
measurements per label.
-For a tutorial on how to use these options and interpret their values, see
@ref{Detecting lines and extracting spectra in 3D data}.
+@item --clumps-w1
+[Objects] Flux weighted center in first WCS axis of all clumps in this object,
see @option{--x}.
+The first WCS axis is commonly used as right ascension in images.
-These options will do measurements on each 2D slice of the input 3D cube;
hence the common the format of @code{--*-in-slice}.
-Each slice usually corresponds to a certain wavelength, you can also think of
these measurements as spectra.
+@item --clumps-w2
+[Objects] Flux weighted declination of all clumps in this object, see
@option{--x}.
+The second WCS axis is commonly used as declination in images.
-For each row (input label), each of the columns described here will contain
multiple values as a vector column.
-The number of measurements in each column is the number of slices in the cube,
or the size of the cube along the third dimension.
-To learn more about vector columns and how to manipulate them, see @ref{Vector
columns}.
-For example usage of these columns in the tutorial above, see @ref{3D
measurements and spectra} and @ref{Extracting a single spectrum and plotting
it}.
+@item --clumps-w3
+[Objects] Flux weighted center in third WCS axis of all clumps in this object,
see @option{--x}.
+The third WCS axis is commonly used as wavelength in integral field unit data
cubes.
-@noindent
-There are two ways to do each measurement on a slice for each label:
-@table @asis
-@item Only label
-The measurement will only be done on the voxels in the slice that are
associated to that label.
-These types of per-slice measurement therefore have the following properties:
-@itemize
-@item
-This will only be a measurement of that label and will not be affected by any
other label.
-@item
-The number of voxels used in each slice can be different (usually only one or
two voxels at the two extremes of the label (along the third dimension), and
many in the middle.
-@item
-Since most labels are localized along the third dimension (maybe only covering
20 slices out of thousands!), many of the measurements (on slices where the
label doesn't exist) will be NaN (for the sum measurements for example) or 0
(for the area measurements).
-@end itemize
-@item Projected label
-MakeCatalog will first project the 3D label into a 2D surface (along the third
dimension) to get its 2D footprint.
-Afterwards, all the voxels in that 2D footprint will be measured all slices.
-All these measurements will have a @option{-proj-} component in their name.
-These types of per-slice measurement therefore has the following properties:
+@item --clumps-geo-w1
+[Objects] Geometric center in first WCS axis of all clumps in this object,
see @option{--geo-x}.
+The first WCS axis is commonly used as right ascension in images.
-@itemize
-@item
-A measurement will be done on each slice of the cube.
-@item
-All measurements will be done on the same surface area.
-@item
-Labels can overlap when they are projected onto the first two FITS dimensions
(the spatial coordinates, not spectral).
-As a result, other emission lines or objects may contaminate the resulting
spectrum for each label.
-@end itemize
+@item --clumps-geo-w2
+[Objects] Geometric center in second WCS axis of all clumps in this object,
see @option{--geo-x}.
+The second WCS axis is commonly used as declination in images.
-To help separate other labels, MakeCatalog can do a third type of measurement
on each slice: measurements on the voxels that belong to other labels but
overlap with the 2D projection.
-This can be used to see how much your projected measurement is affected by
other emission sources (on the projected spectra) and also if multiple lines
(labeled regions) belong to the same physical object.
-These measurements contain @code{-other-} in their name.
+@item --clumps-geo-w3
+[Objects] Geometric center in third WCS axis of all clumps in this object, see
@option{--geo-x}.
+The third WCS axis is commonly used as wavelength in integral field unit data
cubes.
@end table
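+
+For example, on an image whose first WCS axis is RA and second is Dec, the
two sketches below (the file name is an assumption) will produce the same
coordinate columns:
+
+@example
+$ astmkcatalog seg-out.fits --ids --ra --dec
+$ astmkcatalog seg-out.fits --ids --w1 --w2
+@end example
+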
+@node Brightness measurements, Surface brightness measurements, Position
measurements in WCS, MakeCatalog measurements
+@subsubsection Brightness measurements
+
+Within an image, pixels have both a position and a value.
+In the sections above, all the measurements involved position (see
@ref{Position measurements in pixels} or @ref{Position measurements in WCS}).
+The measurements in this section only deal with pixel values and ignore the
pixel positions completely.
+In other words, for the options of this section, each labeled region within
the input is just a group of values (with their associated error values when
necessary); these options let you do various measurements on the resulting
distribution of values.
+
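+As a minimal sketch of such value-only measurements (the file name is an
assumption), the command below requests some of the columns described in this
section:
+
+@example
+$ astmkcatalog seg-out.fits --ids --sum --mean --median --std
+@end example
+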
@table @option
+@item --sum
+The sum of all pixel values associated with this label (object or clump).
+Note that if a sky value or image has been given, it will be subtracted before
any column measurement.
+For clumps, the ambient value (average of the river pixels around the clump,
multiplied by the area of the clump) is subtracted, see @option{--river-mean}.
+So the sum of all the clump-sums in the clump catalog of one object will be
smaller than the @option{--clumps-sum} column of the objects catalog.
-@item --sum-in-slice
-[Only label] Sum of values in each slice.
+If no usable pixels are present over the clump or object (for example, they
are all blank), the returned value will be NaN (note that zero is meaningful).
-@item --sum-err-in-slice
-[Only label] Error in '--sum-in-slice'.
+@item --sum-error
+The (@mymath{1\sigma}) error in measuring the sum of values of a label
(objects or clumps).
-@item --area-in-slice
-[Only label] Number of labeled in each slice.
+The returned value will be NaN when the label covers only NaN pixels in the
values image, or a pixel is NaN in the @option{--instd} image, but non-NaN in
the values image.
+The latter situation usually happens when there is a bug in the previous steps
of your analysis, and is important because those pixels with a NaN in the
@option{--instd} image may contribute significantly to the final error.
+If you want to ignore those pixels in the error measurement, set them to zero
(which is a meaningful number in such scenarios); one way to do this is
sketched after this table.
-@item --sum-proj-in-slice
-[Projected label] Sum of projected area in each slice.
+@item --clumps-sum
+[Objects] The total sum of the pixels covered by clumps (before subtracting
the river) within each object.
+This is simply the sum of @option{--sum-no-river} in the clumps catalog (see
below).
+If no usable pixels are present over the clump or object (for example, they
are all blank), the stored value will be NaN (note that zero is meaningful).
-@item --area-proj-in-slice:
-[Projected label] Number of voxels that are used in
@option{--sum-proj-in-slice}.
+@item --sum-no-river
+[Clumps] The sum of Sky (not river) subtracted clump pixel values.
+By definition, for clumps, the average value of the surrounding rivers is
subtracted from the clump's pixel values as a first-order accounting for
contamination by neighbors.
-@item --sum-proj-err-in-slice
-[Projected label] Error of @option{--sum-proj-in-slice}.
+If no usable pixels are present over the clump or object (for example, they
are all blank), the stored value will be NaN (note that zero is meaningful).
-@item --area-other-in-slice
-[Projected label] Area of other label in projected area on each slice.
+@item --mean
+The mean sky subtracted value of pixels within the object or clump.
+For clumps, the average river flux is subtracted from the sky subtracted mean.
-@item --sum-other-in-slice
-[Projected label] Sum of other label in projected area on each slice.
+@item --std
+The standard deviation of the pixels within the object or clump.
+For clumps, the river pixels are not subtracted because that is a constant
(per pixel) value and should not affect the standard deviation.
-@item --sum-other-err-in-slice:
-[Projected label] Area in @option{--sum-other-in-slice}.
-@end table
+@item --median
+The median sky subtracted value of pixels within the object or clump.
+For clumps, the average river flux is subtracted from the sky subtracted
median.
+@item --maximum
+The maximum value of pixels within the object or clump.
+When the label (object or clump) is larger than three pixels, the maximum is
actually derived from the mean of the brightest three pixels, not the largest
single pixel value of the same label.
+This is because noise fluctuations can be very strong in the extreme values of
the objects/clumps due to Poisson noise (which gets stronger as the mean gets
higher).
+Simply using the maximum pixel value will create a strong scatter in results
that depend on the maximum (for example, the @option{--fwhm} option also uses
this value internally).
+@item --sigclip-number
+The number of elements/pixels in the dataset after sigma-clipping the object
or clump.
+The sigma-clipping parameters can be set with the @option{--sigmaclip} option
described in @ref{MakeCatalog inputs and basic settings}.
+For more on Sigma-clipping, see @ref{Sigma clipping}.
+@item --sigclip-median
+The sigma-clipped median value of the object or clump's pixel distribution.
+For more on sigma-clipping and how to define it, see @option{--sigclip-number}.
+@item --sigclip-mean
+The sigma-clipped mean value of the object or clump's pixel distribution.
+For more on sigma-clipping and how to define it, see @option{--sigclip-number}.
-@node Invoking astmkcatalog, , MakeCatalog measurements, MakeCatalog
-@subsection Invoking MakeCatalog
+@item --sigclip-std
+The sigma-clipped standard deviation of the object or clump's pixel
distribution.
+For more on sigma-clipping and how to define it, see @option{--sigclip-number}.
-MakeCatalog will do measurements and produce a catalog from a labeled dataset
and optional values dataset(s).
-The executable name is @file{astmkcatalog} with the following general template
+@item -m
+@itemx --magnitude
+The magnitude of clumps or objects, see @option{--sum}.
-@example
-$ astmkcatalog [OPTION ...] InputImage.fits
-@end example
+@item --magnitude-error
+The magnitude error of clumps or objects.
+The magnitude error is calculated from the signal-to-noise ratio (see
@option{--sn} and @ref{Quantifying measurement limits}).
+Note that currently this error assumes uncorrelated pixel values and does not
include the error in estimating the aperture (or the error in generating the
labeled image).
-@noindent
-One line examples:
+For now these factors have to be found by other means.
+@url{https://savannah.gnu.org/task/index.php?14124, Task 14124} has been
defined for work on adding these sources of error too.
-@example
-## Create catalog with RA, Dec, Magnitude and Magnitude error,
-## from Segment's output:
-$ astmkcatalog --ra --dec --magnitude seg-out.fits
+The returned value will be NaN when the label covers only NaN pixels in the
values image, or a pixel is NaN in the @option{--instd} image, but non-NaN in
the values image.
+The latter situation usually happens when there is a bug in the previous steps
of your analysis, and is important because those pixels with a NaN in the
@option{--instd} image may contribute significantly to the final error.
+If you want to ignore those pixels in the error measurement, set them to zero
(which is a meaningful number in such scenarios).
-## Same catalog as above (using short options):
-$ astmkcatalog -rdm seg-out.fits
+@item --clumps-magnitude
+[Objects] The magnitude of all clumps in this object, see
@option{--clumps-sum}.
-## Write the catalog to a text table:
-$ astmkcatalog -rdm seg-out.fits --output=cat.txt
+@item --upperlimit
+The upper limit value (in units of the input image) for this object or clump.
+This is the sigma-clipped standard deviation of the random distribution,
multiplied by the value of @option{--upnsigma}.
+See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
+This is very important for the fainter and smaller objects in the image where
the measured magnitudes are not reliable.
-## Output columns specified in `columns.conf':
-$ astmkcatalog --config=columns.conf seg-out.fits
+@item --upperlimit-mag
+The upper limit magnitude for this object or clump.
+See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
+This is very important for the fainter and smaller objects in the image where
the measured magnitudes are not reliable.
-## Use object and clump labels from a K-band image, but pixel values
-## from an i-band image.
-$ astmkcatalog K_segmented.fits --hdu=DETECTIONS --clumpscat \
- --clumpsfile=K_segmented.fits --clumpshdu=CLUMPS \
- --valuesfile=i_band.fits
-@end example
+@item --upperlimit-onesigma
+The @mymath{1\sigma} upper limit value (in units of the input image) for this
object or clump.
+See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
+When @option{--upnsigma=1}, this column's values will be the same as
@option{--upperlimit}.
-@cindex Gaussian
-@noindent
-If MakeCatalog is to do processing (not printing help or option values), an
input labeled image should be provided.
-The options described in this section are those that are particular to
MakeProfiles.
-For operations that MakeProfiles shares with other programs (mainly involving
input/output or general processing steps), see @ref{Common options}.
-Also see @ref{Common program behavior} for some general characteristics of all
Gnuastro programs including MakeCatalog.
+@item --upperlimit-sigma
+The position of the label's sum measured within the distribution of randomly
placed upperlimit measurements in units of the distribution's @mymath{\sigma}
or standard deviation.
+See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
+
+@item --upperlimit-quantile
+The position of the label's sum within the distribution of randomly placed
upperlimit measurements as a quantile (a value between 0 and 1).
+See @ref{Quantifying measurement limits} and @ref{Upper-limit settings} for a
complete explanation.
+If the object is brighter than the brightest randomly placed profile, a value
of @code{inf} is returned.
+If it is less than the minimum, a value of @code{-inf} is reported.
-The various measurements/columns of MakeCatalog are requested as options,
either on the command-line or in configuration files, see @ref{Configuration
files}.
-The full list of available columns is available in @ref{MakeCatalog
measurements}.
-Depending on the requested columns, MakeCatalog needs more than one input
dataset, for more details, please see @ref{MakeCatalog inputs and basic
settings}.
-The upper-limit measurements in particular need several configuration options
which are thoroughly discussed in @ref{Upper-limit settings}.
-Finally, in @ref{MakeCatalog output} the output file(s) created by MakeCatalog
are discussed.
+@item --upperlimit-skew
+@cindex Skewness
+This column contains the non-parametric skew of the @mymath{\sigma}-clipped
random distribution that was used to estimate the upper-limit magnitude.
+Taking @mymath{\mu} as the mean, @mymath{\nu} as the median and
@mymath{\sigma} as the standard deviation, the traditional definition of the
skewness is @mymath{(\mu-\nu)/\sigma}.
-@menu
-* MakeCatalog inputs and basic settings:: Input files and basic settings.
-* Upper-limit settings:: Settings for upper-limit measurements.
-* MakeCatalog output:: File names of MakeCatalog's output table.
-@end menu
+This can be a good measure of how much you can trust the random measurements,
or in other words, how accurately the regions with signal have been
masked/detected.
+If the skewness is strongly positive, you can tell that there is a lot of
undetected signal in the dataset, and therefore that the upper-limit
measurement (and other measurements) are not reliable.
-@node MakeCatalog inputs and basic settings, Upper-limit settings, Invoking
astmkcatalog, Invoking astmkcatalog
-@subsubsection MakeCatalog inputs and basic settings
+@item --river-mean
+[Clumps] The average of the river pixel values around this clump.
+River pixels were defined in Akhlaghi and Ichikawa (2015).
+In short, they are the pixels immediately outside of the clumps.
+This value is used internally to find the sum (or magnitude) and
signal-to-noise ratio of the clumps.
+It can generally also be used as a scale to gauge the base (ambient) flux
surrounding the clump.
+If there are no river pixels, then this column will have the value of the Sky
under the clump.
+So note that this value is @emph{not} sky subtracted.
-MakeCatalog works by using a localized/labeled dataset (see @ref{MakeCatalog}).
-This dataset maps/labels pixels to a specific target (row number in the final
catalog) and is thus the only necessary input dataset to produce a minimal
catalog in any situation.
-Because it only has labels/counters, it must have an integer type (see
@ref{Numeric data types}), see below if your labels are in a floating point
container.
-When the requested measurements only need this dataset (for example,
@option{--geo-x}, @option{--geo-y}, or @option{--geo-area}), MakeCatalog will
not read any more datasets.
+@item --river-num
+[Clumps] The number of river pixels around this clump, see
+@option{--river-mean}.
-Low-level measurements that only use the labeled image are rarely sufficient
for any high-level science case.
-Therefore necessary input datasets depend on the requested columns in each run.
-For example, let's assume you want the brightness/magnitude and
signal-to-noise ratio of your labeled regions.
-For these columns, you will also need to provide an extra dataset containing
values for every pixel of the labeled input (to measure magnitude) and another
for the Sky standard deviation (to measure error).
-All such auxiliary input files have to have the same size (number of pixels in
each dimension) as the input labeled image.
-Their numeric data type is irrelevant (they will be converted to 32-bit
floating point internally).
-For the full list of available measurements, see @ref{MakeCatalog
measurements}.
+@item --sn
+The signal-to-noise ratio (S/N) of all clumps or objects.
+See Akhlaghi and Ichikawa (2015) for the exact equations used.
-The ``values'' dataset is used for measurements like brightness/magnitude, or
flux-weighted positions.
-If it is a real image, by default it is assumed to be already Sky-subtracted
prior to running MakeCatalog.
-If it is not, you use the @option{--subtractsky} option to, so MakeCatalog
reads and subtracts the Sky dataset before any processing.
-To obtain the Sky value, you can use the @option{--sky} option of
@ref{Statistics}, but the best recommended method is @ref{NoiseChisel}, see
@ref{Sky value}.
+The returned value will be NaN when the label covers only NaN pixels in the
values image, or a pixel is NaN in the @option{--instd} image, but non-NaN in
the values image.
+The latter situation usually happens when there is a bug in the previous steps
of your analysis, and is important because those pixels with a NaN in the
@option{--instd} image may contribute significantly to the final error.
+If you want to ignore those pixels in the error measurement, set them to zero
(which is a meaningful number).
-MakeCatalog can also do measurements on sub-structures of detections.
-In other words, it can produce two catalogs.
-Following the nomenclature of Segment (see @ref{Segment}), the main labeled
input dataset is known as ``object'' labels and the (optional) sub-structure
input dataset is known as ``clumps''.
-If MakeCatalog is run with the @option{--clumpscat} option, it will also need
a labeled image containing clumps, similar to what Segment produces (see
@ref{Segment output}).
-Since clumps are defined within detected regions (they exist over signal, not
noise), MakeCatalog uses their boundaries to subtract the level of signal under
them.
+@item --sky
+The sky flux (per pixel) value under this object or clump.
+This is actually the mean value of all the pixels in the sky image that lie on
the same position as the object or clump.
-There are separate options to explicitly request a file name and HDU/extension
for each of the required input datasets as fully described below (with the
@option{--*file} format).
-When each dataset is in a separate file, these options are necessary.
-However, one great advantage of the FITS file format (that is heavily used in
astronomy) is that it allows the storage of multiple datasets in one file.
-So in most situations (for example, if you are using the outputs of
@ref{NoiseChisel} or @ref{Segment}), all the necessary input datasets can be in
one file.
+@item --sky-std
+The sky value standard deviation (per pixel) for this clump or object.
+This is the square root of the mean variance under the object, or the root
mean square.
+@end table
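+
+As an example of the note repeated above (under @option{--sum-error},
@option{--magnitude-error} and @option{--sn}) about blank pixels in the
@option{--instd} image, the sketch below (the file names are assumptions)
uses Gnuastro's Arithmetic program to set such pixels to zero before
measuring the errors:
+
+@example
+## Set the blank pixels of the STD image to zero.
+$ astarithmetic std.fits std.fits isblank 0 where \
+                --output=std-zero.fits
+
+## Use the corrected STD image for the error measurements.
+$ astmkcatalog seg-out.fits --ids --sum --sum-error \
+               --instd=std-zero.fits
+@end example
+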
-When none of the @option{--*file} options are given, MakeCatalog will assume
the necessary input datasets are in the file given as its argument (without any
option).
-When the Sky or Sky standard deviation datasets are necessary and the only
@option{--*file} option called is @option{--valuesfile}, MakeCatalog will
search for these datasets (with the default/given HDUs) in the file given to
@option{--valuesfile} (before looking into the main argument file).
+@node Surface brightness measurements, Morphology measurements nonparametric,
Brightness measurements, MakeCatalog measurements
+@subsubsection Surface brightness measurements
-When the clumps image (necessary with the @option{--clumpscat} option) is
used, MakeCatalog looks into the (possibly existing) @code{NUMLABS} keyword for
the total number of clumps in the image (irrespective of how many objects there
are).
-If it is not present, it will count them and possibly re-label the clumps so
the clump labels always start with 1 and finish with the total number of clumps
in each object.
-The re-labeled clumps image will be stored with the @file{-clumps-relab.fits}
suffix.
-This can slightly slow-down the run.
+In astronomy, surface brightness is most commonly measured in units of
magnitudes per arcsec@mymath{^2} (for the formal definition, see
@ref{Brightness flux magnitude}).
+Therefore it involves both the values of the pixels within each input label
(or output row) and their position.
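+
+For example, using the surface brightness equation of @ref{Brightness flux
magnitude}, a label with a total magnitude of 20 that is spread over an area
of 10 arcsec@mymath{^2} has a surface brightness of
@mymath{20+2.5\log_{10}(10)=22.5} mag/arcsec@mymath{^2}.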
-Note that @code{NUMLABS} is automatically written by Segment in its outputs,
so if you are feeding Segment's clump labels, you can benefit from the improved
speed.
-Otherwise, if you are creating the clumps label dataset manually, it may be
good to include the @code{NUMLABS} keyword in its header and also be sure that
there is no gap in the clump labels.
-For example, if an object has three clumps, they are labeled as 1, 2, 3.
-If they are labeled as 1, 3, 4, or any other combination of three positive
integers that are not an increment of the previous, you might get unknown
behavior.
+@table @option
+@item --sb
+The surface brightness (in units of mag/arcsec@mymath{^2}) of the labeled
region (objects or clumps).
+For more on the definition of the surface brightness, see @ref{Brightness flux
magnitude}.
-It may happen that your labeled objects image was created with a program that
only outputs floating point files.
-However, you know it only has integer valued pixels that are stored in a
floating point container.
-In such cases, you can use Gnuastro's Arithmetic program (see
@ref{Arithmetic}) to change the numerical data type of the image
(@file{float.fits}) to an integer type image (@file{int.fits}) with a command
like below:
+@item --sb-error
+Error in measuring the surface brightness (the @option{--sb} column).
+This column will use the value given to @option{--spatialresolution} in the
processing (in pixels).
+For more on @option{--spatialresolution}, see @ref{MakeCatalog inputs and
basic settings} and for the equation used to derive the surface brightness
error, see @ref{Surface brightness error of each detection}.
-@example
-@command{$ astarithmetic float.fits int32 --output=int.fits}
-@end example
+@item --upperlimit-sb
+The upper-limit surface brightness (in units of mag/arcsec@mymath{^2}) of this
labeled region (object or clump).
+In other words, this option measures the surface brightness of noise within
the footprint of each input label.
-To summarize: if the input file to MakeCatalog is the default/full output of
Segment (see @ref{Segment output}) you do not have to worry about any of the
@option{--*file} options below.
-You can just give Segment's output file to MakeCatalog as described in
@ref{Invoking astmkcatalog}.
-To feed NoiseChisel's output into MakeCatalog, just change the labeled
dataset's header (with @option{--hdu=DETECTIONS}).
-The full list of input dataset options and general setting options are
described below.
+This is just a simple wrapper over lower-level columns: setting B and A to
the values of the @option{--upperlimit} and @option{--arcsec2-area} columns,
this column is filled by simply using the surface brightness equation of
@ref{Brightness flux magnitude}.
-@table @option
+@item --half-sum-sb
+Surface brightness (in units of mag/arcsec@mymath{^2}) within the area that
contains half the total sum of the label's pixels (object or clump).
+This is useful as a measure of the sharpness of an astronomical object: for
example a star will have very few pixels containing half its total sum, so
its @option{--half-sum-sb} will be much brighter than that of a galaxy with
the same magnitude.
+Also consider @option{--half-max-sb} below.
-@item -l FITS
-@itemx --clumpsfile=FITS
-The FITS file containing the labeled clumps dataset when @option{--clumpscat}
is called (see @ref{MakeCatalog output}).
-When @option{--clumpscat} is called, but this option is not, MakeCatalog will
look into the main input file (given as an argument) for the required
extension/HDU (value to @option{--clumpshdu}).
+This column simply plugs half the value of the @option{--sum} column, and the
value of the @option{--half-sum-area} column, into the surface brightness
equation.
+Therefore please see the description in @option{--half-sum-area} to
understand the systematics of this column and potential biases (see
@ref{Morphology measurements nonparametric}).
-@item --clumpshdu=STR
-The HDU/extension of the clump labels dataset.
-Only pixels with values above zero will be considered.
-The clump labels dataset has to be an integer data type (see @ref{Numeric data
types}) and only pixels with a value larger than zero will be used.
-See @ref{Segment output} for a description of the expected format.
+@item --half-max-sb
+The surface brightness (in units of mag/arcsec@mymath{^2}) within the region
that contains half the maximum value of the labeled region.
+Like @option{--half-sum-sb}, this is a good way to identify the ``central''
surface brightness of an object.
+To know when this measurement is reasonable, see the description of
@option{--fwhm} in @ref{Morphology measurements nonparametric}.
-@item -v FITS
-@itemx --valuesfile=FITS
-The file name of the (sky-subtracted) values dataset.
-When any of the columns need values to associate with the input labels (for
example, to measure the sum of pixel values or magnitude of a galaxy, see
@ref{Brightness flux magnitude}), MakeCatalog will look into a ``values'' for
the respective pixel values.
-In most common processing, this is the actual astronomical image that the
labels were defined, or detected, over.
-The HDU/extension of this dataset in the given file can be specified with
@option{--valueshdu}.
-If this option is not called, MakeCatalog will look for the given extension in
the main input file.
+@item --sigclip-mean-sb
+Surface brightness (over 1 pixel's area in arcsec@mymath{^2}) of the
sigma-clipped mean value of the pixel value distribution associated with each
label (object or clump).
+This is useful in scenarios where your labels have approximately
@emph{constant} surface brightness values @emph{after} removing outliers (for
example in a radial profile, see @ref{Invoking astscript-radial-profile}).
-@item --valueshdu=STR/INT
-The name or number (counting from zero) of the extension containing the
``values'' dataset, see the descriptions above and those in
@option{--valuesfile} for more.
+In other scenarios it should be used with extreme care.
+For example over the full area of a galaxy or star, the pixel distribution is
not constant (or symmetric after adding noise): it is inherently skewed, with
few pixels in the center having very large values and many pixels in the
outer parts having lower values.
+Therefore, sigma-clipping is not meaningful for such objects!
+For more on the definition of the surface brightness, see @ref{Brightness flux
magnitude}, for more on sigma-clipping, see @ref{Sigma clipping}.
-@item -s FITS/FLT
-@itemx --insky=FITS/FLT
-Sky value as a single number, or the file name containing a dataset (different
values per pixel or tile).
-The Sky dataset is only necessary when @option{--subtractsky} is called or
when a column directly related to the Sky value is requested (currently
@option{--sky}).
-This dataset may be a tessellation, with one element per tile (see
@option{--oneelempertile} of NoiseChisel's @ref{Processing options}).
+The error in this measurement can be retrieved from the
@option{--sigclip-mean-sb-delta} column described below, and you can use the
@option{--sigclip-std-sb} column to find when the measurement has become
noise-dominated (signal-to-noise ratio is roughly 1).
+See the description of these two options for more.
-When the Sky dataset is necessary but this option is not called, MakeCatalog
will assume it is an HDU/extension (specified by @option{--skyhdu}) in one of
the already given files.
-First it will look for it in the @option{--valuesfile} (if it is given) and
then the main input file (given as an argument).
+@item --sigclip-mean-sb-delta
+Scatter in the @option{--sigclip-mean-sb} without using the standard deviation
of each pixel (that is given by @option{--instd} in @ref{MakeCatalog inputs and
basic settings}).
+The scatter here is measured from the values of the label themselves.
+This measurement is therefore most meaningful when you expect the flux across
one label to be constant (as in a radial profile for example).
-By default the values dataset is assumed to be already Sky subtracted, so
-this dataset is not necessary for many of the columns.
+This is calculated using the equation in @ref{Surface brightness error of each
detection}, where @mymath{\Delta{A}=0} (since sigma-clip is calculated per
pixel and there is no error in a single pixel).
+Within the equation to derive @mymath{\Delta{M}} (the error in magnitude,
derived in @ref{Magnitude measurement error of each detection}), the
signal-to-noise ratio is defined by dividing the sigma-clipped mean by the
sigma-clipped standard deviation.
-@item --skyhdu=STR
-HDU/extension of the Sky dataset, see @option{--skyfile}.
+@item --sigclip-std-sb
+The surface brightness of the sigma-clipped standard deviation of all the
pixels with the same label.
+For labels that are expected to have the same value in all their pixels (for
example each annulus of a radial profile) this can be used to find the reliable
(@mymath{1\sigma}) surface brightness for that label.
+In other words, if @option{--sigclip-mean-sb} is fainter than the value of
this column, you know that noise is becoming significant.
+However, as described in @option{--sigclip-mean-sb}, the sigma-clipped
measurements of MakeCatalog should only be used in certain situations like
radial profiles, see the description there for more.
+@end table
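+
+As a sketch of how these columns can be requested (the file name and the
numerical values are only assumptions; see @ref{Upper-limit settings} for the
relevant options):
+
+@example
+$ astmkcatalog seg-out.fits --ids --sb --upperlimit-sb \
+               --zeropoint=22.5 --upnum=1000
+@end example
+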
-@item --subtractsky
-Subtract the sky value or dataset from the values file prior to any
-processing.
+@node Morphology measurements nonparametric, Morphology measurements
elliptical, Surface brightness measurements, MakeCatalog measurements
+@subsubsection Morphology measurements (non-parametric)
-@item -t STR/FLT
-@itemx --instd=STR/FLT
-Sky standard deviation value as a single number, or the file name containing a
dataset (different values per pixel or tile).
-With the @option{--variance} option you can tell MakeCatalog to interpret this
value/dataset as a variance image, not standard deviation.
+Morphology is defined here as a way to quantify the ``shape'' of an object in
your input image.
+This includes both the position and value of the pixels within your input
labels.
+There are many ways to define the morphology of an object.
+In this section, we will review the available non-parametric measures of
morphology.
+By non-parametric, we mean that no functional shape is assumed for the
measurement.
-@strong{Important note:} This must only be the SKY standard deviation or
variance (not including the signal's contribution to the error).
-In other words, the final standard deviation of a pixel depends on how much
signal there is in it.
-MakeCatalog will find the amount of signal within each pixel (while
subtracting the Sky, if @option{--subtractsky} is called) and account for the
extra error due to it's value (signal).
-Therefore if the input standard deviation (or variance) image also contains
the contribution of signal to the error, then the final error measurements will
be over-estimated.
+In @ref{Morphology measurements elliptical} you can see some parametric
elliptical measurements (which are only valid when the object is actually an
ellipse).
-@item --stdhdu=STR
-The HDU of the Sky value standard deviation image.
+@table @option
+@item --num-clumps
+[Objects] The number of clumps in this object.
-@item --variance
-The dataset given to @option{--instd} (and @option{--stdhdu} has the Sky
variance of every pixel, not the Sky standard deviation.
+@item --area
+The area (number of pixels/voxels) of this clump or object that was used in
the measurements: pixels that are NaN/blank or otherwise unused in the values
image are not counted (to count them too, see @option{--geo-area}).
-@item --forcereadstd
-Read the input STD image even if it is not required by any of the requested
columns.
-This is because some of the output catalog's metadata may need it, for
example, to calculate the dataset's surface brightness limit (see
@ref{Quantifying measurement limits}, configured with @option{--sfmagarea} and
@option{--sfmagnsigma} in @ref{MakeCatalog output}).
+@item --arcsec2-area
+The used (non-blank in values image) area of the labeled region in units of
arc-seconds squared.
+This column is just the value of the @option{--area} column, multiplied by
the area of each pixel in the input image (in units of arcsec@mymath{^2}).
+Similar to the @option{--ra} or @option{--dec} columns, for this option to
work, the objects extension used has to have a WCS structure.
-Furthermore, if the input STD image does not have the @code{MEDSTD} keyword
(that is meant to contain the representative standard deviation of the full
image), with this option, the median will be calculated and used for the
surface brightness limit.
+@item --area-min-val
+The number of pixels that are equal to the minimum value of the labeled region
(clump or object).
-@item -z FLT
-@itemx --zeropoint=FLT
-The zero point magnitude for the input image, see @ref{Brightness flux
magnitude}.
+@item --area-max-val
+The number of pixels that are equal to the maximum value of the labeled region
(clump or object).
-@item --sigmaclip FLT,FLT
-The sigma-clipping parameters when any of the sigma-clipping related columns
are requested (for example, @option{--sigclip-median} or
@option{--sigclip-number}).
+@item --area-xy
+@cindex IFU: Integral Field Unit
+@cindex Integral Field Unit
+Similar to @option{--area}, when the clump or object is projected onto the
first two dimensions.
+This is only available for 3-dimensional datasets.
+When working with Integral Field Unit (IFU) datasets, this projection onto the
first two dimensions would be a narrow-band image.
-This option takes two values: the first is the multiple of @mymath{\sigma},
and the second is the termination criteria.
-If the latter is larger than 1, it is read as an integer number and will be
the number of times to clip.
-If it is smaller than 1, it is interpreted as the tolerance level to stop
clipping.
-See @ref{Sigma clipping} for a complete explanation.
+@item --fwhm
+@cindex FWHM
+The full width at half maximum (in units of pixels, along the semi-major axis)
of the labeled region (object or clump).
+The maximum value is estimated from the mean of the top-three pixels with the
highest values, see the description under @option{--maximum}.
+The number of pixels with values above half of that maximum is then found
(the value in the @option{--half-max-area} column) and a radius is estimated
from that area.
+See the description under @option{--half-sum-radius} for more on converting
an area to a radius along the major axis.
-@item --frac-max=FLT[,FLT]
-The fractions (one or two) of maximum value in objects or clumps to be used in
the related columns, for example, @option{--frac-max1-area},
@option{--frac-max1-sum} or @option{--frac-max1-radius}, see @ref{MakeCatalog
measurements}.
-For the maximum value, see the description of @option{--maximum} column below.
-The value(s) of this option must be larger than 0 and smaller than 1 (they are
a fraction).
-When only @option{--frac-max1-area} or @option{--frac-max1-sum} is requested,
one value must be given to this option, but if @option{--frac-max2-area} or
@option{--frac-max2-sum} are also requested, two values must be given to this
option.
-The values can be written as simple floating point numbers, or as fractions,
for example, @code{0.25,0.75} and @code{0.25,3/4} are the same.
+Because of its non-parametric nature, this column is most reliable on clumps
and should only be used on objects with great caution.
+This is because objects can have more than one clump (peak with true signal)
and multiple peaks are not treated separately in objects, so the result of
this column will be biased.
-@item --spatialresolution=FLT
-The error in measuring spatial properties (for example, the area) in units of
pixels.
-You can think of this as the FWHM of the dataset's PSF and is used in
measurements like the error in surface brightness (@option{--sb-error}, see
@ref{MakeCatalog measurements}).
-Ideally, images are taken in the optimal Nyquist sampling @ref{Sampling
theorem}, so the default value for this option is 2.
-But in practice real images my be over-sampled (usually ground-based images,
where you will need to increase the default value) or undersampled (some
space-based images, where you will need to decrease the default value).
+Also, because of its non-parametric nature, this FWHM does not account for
the PSF, and it will be strongly affected by noise if the object is
faint/diffuse.
+So when half the maximum value (which can be requested using the
@option{--maximum} column) is too close to the local noise level (which can be
requested using the @option{--sky-std} column), the value returned in this
column is meaningless (it is just noise peaks which are randomly distributed
over the area).
+You can therefore use the @option{--maximum} and @option{--sky-std} columns
to remove, or flag, unreliable FWHMs; one way to do this is sketched after
this table.
+For example, if a labeled region's maximum is less than 2 times the sky
standard deviation, the value will certainly be unreliable (half of that is
@mymath{1\sigma}!).
+For a more reliable value, this fraction should be around 4 (so half the
maximum is 2@mymath{\sigma}).
-@item --inbetweenints
-Output will contain one row for all integers between 1 and the largest label
in the input (irrespective of their existance in the input image).
-By default, MakeCatalog's output will only contain rows with integers that
actually corresponded to at least one pixel in the input dataset.
+@item --half-max-area
+The number of pixels with values larger than half the maximum flux within the
labeled region.
+This option is used to estimate @option{--fwhm}, so please read the notes
there for the caveats and necessary precautions.
-For example, if the input's only labeled pixel values are 11 and 13,
MakeCatalog's default output will only have two rows.
-If you use this option, it will have 13 rows and all the columns corresponding
to integer identifiers that did not correspond to any pixel will be 0 or NaN
(depending on context).
-@end table
+@item --half-max-radius
+The radius of the region containing half the maximum flux within the labeled
region.
+This is just half the value reported by @option{--fwhm}.
+@item --half-max-sum
+The sum of the pixel values containing half the maximum flux within the
labeled region (or those that are counted in @option{--half-max-area}).
+This option uses the pixels within @option{--fwhm}, so please read the notes
there for the caveats and necessary precautions.
+@item --half-sum-area
+The number of pixels that contain half the object or clump's total sum of
pixels (half the value in the @option{--sum} column).
+To count this area, all the non-blank values associated with the given label
(object or clump) will be sorted and summed in order (starting from the
maximum), until the sum becomes larger than half the total sum of the label's
pixels.
+This option is thus good for clumps (which are defined to have a single peak
in their morphology), but for objects you should be careful: if the object
includes multiple peaks/clumps at roughly the same level, then the area
reported by this option will be distributed over all the peaks.
+@item --half-sum-radius
+Radius (in units of pixels) derived from the area that contains half the total
sum of the label's pixels (value reported by @option{--halfsumarea}).
+If the area is @mymath{A_h} and the axis ratio is @mymath{q}, then the value
returned in this column is @mymath{\sqrt{A_h/({\pi}q)}}.
+This option is a good measure of the concentration of the @emph{observed}
(after PSF convolution and with noise) object or clump, but as described
below it underestimates the effective radius.
+Also, it should be used with caution on objects that may have multiple
clumps.
+It is most reliable with clumps or objects that have one or zero clumps, see
the note under @option{--half-sum-area}.
-@node Upper-limit settings, MakeCatalog output, MakeCatalog inputs and basic
settings, Invoking astmkcatalog
-@subsubsection Upper-limit settings
+@cindex Ellipse area
+@cindex Area, ellipse
+Recall that in general, for an ellipse with semi-major axis @mymath{a},
semi-minor axis @mymath{b}, and axis ratio @mymath{q=b/a} the area (@mymath{A})
is @mymath{A={\pi}ab={\pi}qa^2}.
+For a circle (where @mymath{q=1}), this simplifies to the familiar
@mymath{A={\pi}a^2}.
-The upper-limit magnitude was discussed in @ref{Quantifying measurement
limits}.
-Unlike other measured values/columns in MakeCatalog, the upper limit magnitude
needs several extra parameters which are discussed here.
-All the options specific to the upper-limit measurements start with
@option{up} for ``upper-limit''.
-The only exception is @option{--envseed} that is also present in other
programs and is general for any job requiring random number generation in
Gnuastro (see @ref{Generating random numbers}).
+@cindex S@'ersic profile
+@cindex Effective radius
+This option should not be confused with the @emph{effective radius} for
S@'ersic profiles, commonly written as @mymath{r_e}.
+For more on the S@'ersic profile and @mymath{r_e}, please see @ref{Galaxies}.
+Therefore, when @mymath{r_e} is meaningful for the target (the target is
elliptically symmetric and can be parameterized as a S@'ersic profile),
@mymath{r_e} should be derived from fitting the profile with a S@'ersic
function which has been convolved with the PSF.
+But from the equation above, you see that this radius is derived from the raw
image's labeled values (after convolution, with no parametric profile), so this
column's value will generally be (much) smaller than @mymath{r_e}, depending on
the PSF, depth of the dataset, the morphology, or if a fraction of the profile
falls on the edge of the image.
-@cindex Reproducibility
-One very important consideration in Gnuastro is reproducibility.
-Therefore, the values to all of these parameters along with others (like the
random number generator type and seed) are also reported in the comments of the
final catalog when the upper limit magnitude column is desired.
-The random seed that is used to define the random positions for each object or
clump is unique and set based on the (optionally) given seed, the total number
of objects and clumps and also the labels of the clumps and objects.
-So with identical inputs, an identical upper-limit magnitude will be found.
-However, even if the seed is identical, when the ordering of the object/clump
labels differs between different runs, the result of upper-limit measurements
will not be identical.
+In other words, this option can only be interpreted as an effective radius if
there is no noise and no PSF, and the profile within the image extends to
infinity (or a very large multiple of the effective radius) and is not near
the edge of the image.
-MakeCatalog will randomly place the object/clump footprint over the dataset.
-When the randomly placed footprint does not fall on any object or masked
region (see @option{--upmaskfile}) it will be used in the final distribution.
-Otherwise that particular random position will be ignored and another random
position will be generated.
-Finally, when the distribution has the desired number of successfully measured
random samples (@option{--upnum}) the distribution's properties will be
measured and placed in the catalog.
+@item --frac-max1-area
+@itemx --frac-max2-area
+Number of pixels brighter than the given fraction(s) of the maximum pixel
value.
+For the maximum value, see the description of @option{--maximum} column.
+The fraction(s) are given through the @option{--frac-max} option (that can
take two values) which is described in @ref{MakeCatalog inputs and basic
settings}.
+Recall that in @option{--half-max-area}, the fraction is fixed to 0.5.
+Hence, together with these two columns, you can sample three parts of the
profile's area.
-When the profile is very large or the image is significantly covered by
detections, it might not be possible to find the desired number of samplings in
a reasonable time.
-MakeProfiles will continue searching until it is unable to find a successful
position (since the last successful measurement@footnote{The counting of failed
positions restarts on every successful measurement.}), for a large multiple of
@option{--upnum} (currently@footnote{In Gnuastro's source, this constant number
is defined as the @code{MKCATALOG_UPPERLIMIT_MAXFAILS_MULTIP} macro in
@file{bin/mkcatalog/main.h}, see @ref{Downloading the source}.} this is 10).
-If @option{--upnum} successful samples cannot be found until this limit is
reached, MakeCatalog will set the upper-limit magnitude for that object to NaN
(blank).
+@item --frac-max1-sum
+@itemx --frac-max2-sum
+Sum of pixels brighter than the given fraction(s) of the maximum pixel value.
+For the maximum value, see the description of the @option{--maximum} column
below.
+The fraction(s) are given through the @option{--frac-max} option (which can
take two values); it is described in @ref{MakeCatalog inputs and basic
settings}.
+Recall that in @option{--halfmaxsum}, the fraction is fixed to 0.5.
+Hence, together with these two columns, you can sample the profile's sum of
pixels at three different fractions of the maximum.
-MakeCatalog will also print a warning if the range of positions available for
the labeled region is smaller than double the size of the region.
-In such cases, the limited range of random positions can artificially decrease
the standard deviation of the final distribution.
-If your dataset can allow it (it is large enough), it is recommended to use a
larger range if you see such warnings.
+@item --frac-max1-radius
+@itemx --frac-max2-radius
+Radius (in units of pixels) derived from the area containing pixels brighter
than the given fraction(s) of the maximum pixel value (the area reported by
@option{--frac-max1-area} or @option{--frac-max2-area}).
+For the maximum value, see the description of the @option{--maximum} column
below.
+The fractions are given through the @option{--frac-max} option (which can take
two values); it is described in @ref{MakeCatalog inputs and basic settings}.
+Recall that in @option{--fwhm}, the fraction is fixed to 0.5.
+Hence, together with these two columns, you can sample the profile's radius at
three different fractions of the maximum.
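+For example, a minimal sketch (assuming @file{seg.fits} is Segment's default
+output, so the values dataset is found automatically) that samples the area
+and radius at two fractions of the maximum:
+
+@example
+$ astmkcatalog seg.fits --frac-max=0.25,0.75 \
+               --frac-max1-area --frac-max2-area \
+               --frac-max1-radius --frac-max2-radius
+@end example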
-@table @option
+@item --clumps-area
+[Objects] The total area of all the clumps in this object.
-@item --upmaskfile=FITS
-File name of mask image to use for upper-limit calculation.
-In some cases (especially when doing matched photometry), the object labels
specified in the main input and mask image might not be adequate.
-In other words they do not necessarily have to cover @emph{all} detected
objects: the user might have selected only a few of the objects in their
labeled image.
-This option can be used to ignore regions in the image in these situations
when estimating the upper-limit magnitude.
-All the non-zero pixels of the image specified by this option (in the
@option{--upmaskhdu} extension) will be ignored in the upper-limit magnitude
measurements.
+@item --weight-area
+The area (number of pixels) used in the flux weighted position calculations.
-For example, when you are using labels from another image, you can give
NoiseChisel's objects image output for this image as the value to this option.
-In this way, you can be sure that regions with data do not harm your
distribution.
-See @ref{Quantifying measurement limits} for more on the upper limit magnitude.
+@item --geo-area
+The area of all the pixels labeled with an object or clump.
+Note that unlike @option{--area}, pixel values are completely ignored in this
column.
+For example, if a pixel value is blank, it will not be counted in
@option{--area}, but will be counted here.
-@item --upmaskhdu=STR
-The extension in the file specified by @option{--upmaskfile}.
+@item --geo-area-xy
+Similar to @option{--geo-area}, when the clump or object is projected onto the
first two dimensions.
+This is only available for 3-dimensional datasets.
+When working with Integral Field Unit (IFU) datasets, this projection onto the
first two dimensions would be a narrow-band image.
+@end table
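+For example, a minimal sketch (assuming @file{seg.fits} is Segment's default
+output) comparing the value-based and purely geometric areas:
+
+@example
+$ astmkcatalog seg.fits --ids --area --geo-area
+@end example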
-@item --upnum=INT
-The number of random samples to take for all the objects.
-A larger value to this option will give a more accurate result
(asymptotically), but it will also slow down the process.
-When a randomly positioned sample overlaps with a detected/masked pixel it is
not counted and another random position is found until the object completely
lies over an undetected region.
-So you can be sure that for each object, this many samples over undetected
regions are made.
-See the upper limit magnitude discussion in @ref{Quantifying measurement
limits} for more.
+@node Morphology measurements elliptical, Measurements per slice spectra,
Morphology measurements nonparametric, MakeCatalog measurements
+@subsubsection Morphology measurements (elliptical)
-@item --uprange=INT,INT
-The range/width of the region (in pixels) to do random sampling along each
dimension of the input image around each object's position.
-This is not a mandatory option and if not given (or given a value of zero in a
dimension), the full possible range of the dataset along that dimension will be
used.
-This is useful when the noise properties of the dataset vary gradually.
-In such cases, using the full range of the input dataset is going to bias the
result.
-However, note that decreasing the range of available positions too much will
also artificially decrease the standard deviation of the final distribution
(and thus bias the upper-limit measurement).
+When your target objects are sufficiently ellipse-like, you can use the
measurements below to quantify the various parameters of the ellipse.
+For details of how the elliptical parameters are measured, see @ref{Measuring
elliptical parameters}.
+For non-parametric morphological measurements, see @ref{Morphology
measurements nonparametric}.
+The measurements that start with @option{--geo-*} ignore the pixel values and
just do the measurements on the label's ``geometric'' shape.
-@item --envseed
-@cindex Seed, Random number generator
-@cindex Random number generator, Seed
-Read the random number generator type and seed value from the environment (see
@ref{Generating random numbers}).
-Random numbers are used in calculating the random positions of different
samples of each object.
+@table @option
+@item --semi-major
+The pixel-value weighted root mean square (RMS) along the semi-major axis of
the profile (assuming it is an ellipse) in units of pixels.
-@item --upsigmaclip=FLT,FLT
-The raw distribution of random values will not be used to find the upper-limit
magnitude, it will first be @mymath{\sigma}-clipped (see @ref{Sigma clipping})
to avoid outliers in the distribution (mainly the faint undetected wings of
bright/large objects in the image).
-This option takes two values: the first is the multiple of @mymath{\sigma},
and the second is the termination criteria.
-If the latter is larger than 1, it is read as an integer number and will be
the number of times to clip.
-If it is smaller than 1, it is interpreted as the tolerance level to stop
clipping. See @ref{Sigma clipping} for a complete explanation.
+@item --semi-minor
+The pixel-value weighted root mean square (RMS) along the semi-minor axis of
the profile (assuming it is an ellipse) in units of pixels.
-@item --upnsigma=FLT
-The multiple of the final (@mymath{\sigma}-clipped) standard deviation (or
@mymath{\sigma}) used to measure the upper-limit sum or magnitude.
+@item --axis-ratio
+The pixel-value weighted axis ratio (semi-minor/semi-major) of the object or
clump.
-@item --checkuplim=INT[,INT]
-Print a table of positions and measured values for all the full random
distribution used for one particular object or clump.
-If only one integer is given to this option, it is interpreted to be an
object's label.
-If two values are given, the first is the object label and the second is the
ID of requested clump within it.
+@item --position-angle
+The pixel-value weighted angle of the semi-major axis with the first FITS axis
in degrees.
-The output is a table with three columns (whether it is FITS or plain-text is
determined with the @option{--tableformat} option, see @ref{Input output
options}).
-The first two columns are the pixel X,Y positions of the center of each
label's tile (see next paragraph), in each random sampling of this particular
object/clump.
-The third column is the measured flux over that region.
-If the region overlapped with a detection or masked pixel, then its measured
value will be a NaN (not-a-number).
-The total number of rows is thus unknown before running.
-However, if an upper-limit measurement was made in the main output of
MakeCatalog, you can be sure that the number of rows with non-NaN measurements
is the number given to the @option{--upnum} option.
+@item --geo-semi-major
+The geometric (ignoring pixel values) root mean square (RMS) along the
semi-major axis of the profile, assuming it is an ellipse, in units of pixels.
-The ``tile'' of each label is defined by the minimum and maximum positions of
each label: values of the @option{--min-x}, @option{--max-x}, @option{--min-y}
and @option{--max-y} columns in the main output table for each label.
-Therefore, the tile center position that is recorded in the output of this
column ignores the distribution of labeled pixels within the tile.
+@item --geo-semi-minor
+The geometric (ignoring pixel values) root mean square (RMS) along the
semi-minor axis of the profile, assuming it is an ellipse, in units of pixels.
-Precise interpretation of the position is only relevant when the footprint of
your label is highly asymmetrical and you want to use this catalog to insert
your object into the image.
-In such a case, you can also ask for @option{--min-x} and @option{--min-y} and
manually calculate their difference with the following two positional
measurements of your desired label: @option{--geo-x} and @option{--geo-y}
(which report the label's ``geometric'' center; only using the label positions
ignoring any ``values'') or @option{--x} and @option{--y} (which report the
value-weighted center of the label).
-Adding the difference with the position reported by this column, will let you
define alternative ``center''s for your label in particular situations (this
will usually not be necessary!).
-For more on these positional columns, see @ref{Position measurements in
pixels}.
+@item --geo-axis-ratio
+The geometric (ignoring pixel values) axis ratio of the profile, assuming it
is an ellipse.
+
+@item --geo-position-angle
+The geometric (ignoring pixel values) angle of the semi-major axis with the
first FITS axis in degrees.
@end table
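+For example, a minimal sketch (assuming @file{seg.fits} is Segment's default
+output) requesting the basic ellipse parameters above:
+
+@example
+$ astmkcatalog seg.fits --ids --semi-major --semi-minor \
+               --axis-ratio --position-angle
+@end example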
+@node Measurements per slice spectra, , Morphology measurements elliptical,
MakeCatalog measurements
+@subsubsection Measurements per slice (spectra)
+
+@cindex Spectrum
+@cindex 3D data-cubes
+@cindex Cubes (3D data)
+@cindex IFU: Integral Field Unit
+@cindex Integral field unit (IFU)
+@cindex Spectrum (of astronomical source)
+When the input is a 3D data cube, MakeCatalog has the following multi-valued
measurements per label.
+For a tutorial on how to use these options and interpret their values, see
@ref{Detecting lines and extracting spectra in 3D data}.
-@node MakeCatalog output, , Upper-limit settings, Invoking astmkcatalog
-@subsubsection MakeCatalog output
-After it has completed all the requested measurements (see @ref{MakeCatalog
measurements}), MakeCatalog will store its measurements in table(s).
-If an output filename is given (see @option{--output} in @ref{Input output
options}), the format of the table will be deduced from the name.
-When it is not given, the input name will be appended with a @file{_cat}
suffix (see @ref{Automatic output}) and its format will be determined from the
@option{--tableformat} option, which is also discussed in @ref{Input output
options}.
-@option{--tableformat} is also necessary when the requested output name is a
FITS table (recall that FITS can accept ASCII and binary tables, see
@ref{Table}).
+These options will do measurements on each 2D slice of the input 3D cube;
hence the common @code{--*-in-slice} format of their names.
+Each slice usually corresponds to a certain wavelength, so you can also think
of these measurements as spectra.
-By default (when @option{--spectrum} or @option{--clumpscat} are not called)
only a single catalog/table will be created for the labeled ``objects''.
+For each row (input label), each of the columns described here will contain
multiple values as a vector column.
+The number of measurements in each column is the number of slices in the cube,
or the size of the cube along the third dimension.
+To learn more about vector columns and how to manipulate them, see @ref{Vector
columns}.
+For example usage of these columns in the tutorial above, see @ref{3D
measurements and spectra} and @ref{Extracting a single spectrum and plotting
it}.
+@noindent
+There are two ways to do each measurement on a slice for each label:
+@table @asis
+@item Only label
+The measurement will only be done on the voxels in the slice that are
associated with that label.
+These types of per-slice measurement therefore have the following properties:
@itemize
@item
-if @option{--clumpscat} is called, a secondary catalog/table will also be
created for ``clumps'' (one of the outputs of the Segment program, for more on
``objects'' and ``clumps'', see @ref{Segment}).
-In short, if you only have one labeled image, you do not have to worry about
clumps and just ignore this.
+This will only be a measurement of that label and will not be affected by any
other label.
@item
-When @option{--spectrum} is called, it is not mandatory to specify any
single-valued measurement columns. In this case, the output will only be the
spectra of each labeled region within a 3D datacube.
-For more, see the description of @option{--spectrum} in @ref{MakeCatalog
measurements}.
+The number of voxels used in each slice can be different: usually there are
only one or two voxels at the two extremes of the label (along the third
dimension) and many in the middle.
+@item
+Since most labels are localized along the third dimension (maybe only covering
20 slices out of thousands!), many of the measurements (on slices where the
label doesn't exist) will be NaN (for the sum measurements for example) or 0
(for the area measurements).
@end itemize
+@item Projected label
+MakeCatalog will first project the 3D label into a 2D surface (along the third
dimension) to get its 2D footprint.
+Afterwards, all the voxels in that 2D footprint will be measured in all
slices.
+All these measurements will have a @option{-proj-} component in their name.
+These types of per-slice measurement therefore have the following properties:
-@cindex Surface brightness limit
-@cindex Limit, Surface brightness
-When possible, MakeCatalog will also measure the full input's noise level
(also known as surface brightness limit, see @ref{Quantifying measurement
limits}).
-Since these measurements are related to the noise and not any particular
labeled object, they are stored as keywords in the output table.
-Furthermore, they are only possible when a standard deviation image has been
loaded (done automatically for any column measurement that involves noise, for
example, @option{--sn}, @option{--magnitude-error} or @option{--sky-std}).
-But if you just want the surface brightness limit and no noise-related column,
you can use @option{--forcereadstd}.
-All these keywords start with @code{SBL} (for ``surface brightness limit'')
and are described below:
+@itemize
+@item
+A measurement will be done on each slice of the cube.
+@item
+All measurements will be done on the same surface area.
+@item
+Labels can overlap when they are projected onto the first two FITS dimensions
(the spatial coordinates, not spectral).
+As a result, other emission lines or objects may contaminate the resulting
spectrum for each label.
+@end itemize
-@table @code
-@item SBLSTD
-Per-pixel standard deviation.
-If a @code{MEDSTD} keyword exists in the standard deviation dataset, then that
value is directly used.
+To help separate other labels, MakeCatalog can do a third type of measurement
on each slice: measurements on the voxels that belong to other labels but
overlap with the 2D projection.
+This can be used to see how much your projected measurement is affected by
other emission sources (on the projected spectra) and also if multiple lines
(labeled regions) belong to the same physical object.
+These measurements contain @code{-other-} in their name.
+@end table
-@item SBLNSIG
-Sigma multiple for surface brightness limit (value you gave to
@option{--sfmagnsigma}), used for @code{SBLMAGPX} and @code{SBLMAG}.
+@table @option
-@item SBLMAGPX
-Per-pixel surface brightness limit (in units of magnitudes/pixel).
+@item --sum-in-slice
+[Only label] Sum of values in each slice.
-@item SBLAREA
-Area (in units of arcsec@mymath{^2}) used in @code{SBLMAG} (value you gave to
@option{--sfmagarea}).
+@item --sum-err-in-slice
+[Only label] Error in @option{--sum-in-slice}.
-@item SBLMAG
-Surface brightness limit of data calculated over @code{SBLAREA} (in units of
mag/arcsec@mymath{^2}).
-@end table
+@item --area-in-slice
+[Only label] Number of labeled voxels in each slice.
+
+@item --sum-proj-in-slice
+[Projected label] Sum of values within the label's projected 2D footprint in
each slice.
+
+@item --area-proj-in-slice
+[Projected label] Number of voxels that are used in
@option{--sum-proj-in-slice}.
+
+@item --sum-proj-err-in-slice
+[Projected label] Error of @option{--sum-proj-in-slice}.
+
+@item --area-other-in-slice
+[Projected label] Number of voxels belonging to other labels that fall within
the label's projected 2D footprint, in each slice.
+
+@item --sum-other-in-slice
+[Projected label] Sum of values of other labels' voxels that fall within the
label's projected 2D footprint, in each slice.
-When any of the upper-limit measurements are requested, the input parameters
for the upper-limit measurement are stored in the keywords starting with
@code{UP}: @code{UPNSIGMA}, @code{UPNUMBER}, @code{UPRNGNAM}, @code{UPRNGSEE},
@code{UPSCMLTP}, @code{UPSCTOL}.
-These are primarily input arguments, so they correspond to the options with a
similar name.
+@item --sum-other-err-in-slice
+[Projected label] Error in @option{--sum-other-in-slice}.
+@end table
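+For example, a minimal sketch (the 3D input's name is hypothetical) that
+measures both the label-only and projected sums, so the two spectra can be
+compared afterwards:
+
+@example
+$ astmkcatalog cube-seg.fits --sum-in-slice --sum-proj-in-slice
+@end example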
-The full list of MakeCatalog's options relating to the output file format and
keywords are listed below.
-See @ref{MakeCatalog measurements} for specifying which columns you want in
the final catalog.
-@table @option
-@item -C
-@itemx --clumpscat
-Do measurements on clumps and produce a second catalog (only devoted to
clumps).
-When this option is given, MakeCatalog will also look for a secondary labeled
dataset (identifying substructure) and produce a catalog from that.
-For more on the definition on ``clumps'', see @ref{Segment}.
-When the output is a FITS file, the objects and clumps catalogs/tables will be
stored as multiple extensions of one FITS file.
-You can use @ref{Table} to inspect the column meta-data and contents in this
case.
-However, in plain text format (see @ref{Gnuastro text table format}), it is
only possible to keep one table per file.
-Therefore, if the output is a text file, two output files will be created,
ending in @file{_o.txt} (for objects) and @file{_c.txt} (for clumps).
-@item --noclumpsort
-Do not sort the clumps catalog based on object ID (only relevant with
@option{--clumpscat}).
-This option will benefit the performance@footnote{The performance boost due to
@option{--noclumpsort} can only be felt when there are a huge number of objects.
-Therefore, by default the output is sorted to avoid misunderstandings or
bugs in the user's scripts when the user forgets to sort the outputs.} of
MakeCatalog when it is run on multiple threads @emph{and} the position of the
rows in the clumps catalog is irrelevant (for example, you just want the
number-counts).
-MakeCatalog does all its measurements on each @emph{object} independently and
in parallel.
-As a result, while it is writing the measurements on each object's clumps, it
does not know how many clumps there were in previous objects.
-Each thread will just fetch the first available row and write the information
of clumps (in order) starting from that row.
-After all the measurements are done, by default (when this option is not
called), MakeCatalog will reorder/permute the clumps catalog to have both the
object and clump ID in an ascending order.
+@node Invoking astmkcatalog, , MakeCatalog measurements, MakeCatalog
+@subsection Invoking MakeCatalog
-If you would like to order the catalog later (when it is a plain text file),
you can run the following command to sort the rows by object ID (and clump ID
within each object), assuming they are respectively the first and second
columns:
+MakeCatalog will do measurements and produce a catalog from a labeled dataset
and optional values dataset(s).
+The executable name is @file{astmkcatalog} with the following general template
@example
-$ awk '!/^#/' out_c.txt | sort -g -k1,1 -k2,2
+$ astmkcatalog [OPTION ...] InputImage.fits
@end example
-@item --sfmagnsigma=FLT
-Value to multiply with the median standard deviation (from a @command{MEDSTD}
keyword in the Sky standard deviation image) for estimating the surface
brightness limit.
-Note that the surface brightness limit is only reported when a standard
deviation image is read, in other words a column using it is requested (for
example, @option{--sn}) or @option{--forcereadstd} is called.
+@noindent
+One line examples:
-This value is a per-pixel value, not per object/clump and is not found over an
area or aperture, like the @mymath{5\sigma} values that are commonly
reported as a measure of depth or the upper-limit measurements (see
@ref{Quantifying measurement limits}).
+@example
+## Create catalog with RA, Dec and Magnitude from Segment's
+## output:
+$ astmkcatalog --ra --dec --magnitude seg-out.fits
-@item --sfmagarea=FLT
-Area (in arc-seconds squared) to convert the per-pixel estimation of
@option{--sfmagnsigma} in the comments section of the output tables.
-Note that the surface brightness limit is only reported when a standard
deviation image is read, in other words a column using it is requested (for
example, @option{--sn}) or @option{--forcereadstd} is called.
+## Same catalog as above (using short options):
+$ astmkcatalog -rdm seg-out.fits
-Note that this is just a unit conversion using the World Coordinate System
(WCS) information in the input's header.
-It does not actually do any measurements on this area.
-For random measurements on any area, please use the upper-limit columns of
MakeCatalog (see the discussion on upper-limit measurements in @ref{Quantifying
measurement limits}).
-@end table
+## Write the catalog to a text table:
+$ astmkcatalog -rdm seg-out.fits --output=cat.txt
+## Output columns specified in `columns.conf':
+$ astmkcatalog --config=columns.conf seg-out.fits
+## Use object and clump labels from a K-band image, but pixel values
+## from an i-band image.
+$ astmkcatalog K_segmented.fits --hdu=DETECTIONS --clumpscat \
+ --clumpsfile=K_segmented.fits --clumpshdu=CLUMPS \
+ --valuesfile=i_band.fits
+@end example
+@cindex Gaussian
+@noindent
+If MakeCatalog is to do processing (not printing help or option values), an
input labeled image should be provided.
+The options described in this section are those that are particular to
MakeCatalog.
+For operations that MakeCatalog shares with other programs (mainly involving
input/output or general processing steps), see @ref{Common options}.
+Also see @ref{Common program behavior} for some general characteristics of all
Gnuastro programs including MakeCatalog.
+The various measurements/columns of MakeCatalog are requested as options,
either on the command-line or in configuration files, see @ref{Configuration
files}.
+The full list of available columns is available in @ref{MakeCatalog
measurements}.
+Depending on the requested columns, MakeCatalog may need more than one input
dataset; for more details, please see @ref{MakeCatalog inputs and basic
settings}.
+The upper-limit measurements in particular need several configuration options
which are thoroughly discussed in @ref{Upper-limit settings}.
+Finally, in @ref{MakeCatalog output} the output file(s) created by MakeCatalog
are discussed.
+@menu
+* MakeCatalog inputs and basic settings:: Input files and basic settings.
+* Upper-limit settings:: Settings for upper-limit measurements.
+* MakeCatalog output:: File names of MakeCatalog's output table.
+@end menu
+@node MakeCatalog inputs and basic settings, Upper-limit settings, Invoking
astmkcatalog, Invoking astmkcatalog
+@subsubsection MakeCatalog inputs and basic settings
+MakeCatalog works by using a localized/labeled dataset (see @ref{MakeCatalog}).
+This dataset maps/labels pixels to a specific target (row number in the final
catalog) and is thus the only necessary input dataset to produce a minimal
catalog in any situation.
+Because it only has labels/counters, it must have an integer type (see
@ref{Numeric data types}); see below if your labels are in a floating point
container.
+When the requested measurements only need this dataset (for example,
@option{--geo-x}, @option{--geo-y}, or @option{--geo-area}), MakeCatalog will
not read any more datasets.
+Low-level measurements that only use the labeled image are rarely sufficient
for any high-level science case.
+Therefore necessary input datasets depend on the requested columns in each run.
+For example, let's assume you want the brightness/magnitude and
signal-to-noise ratio of your labeled regions.
+For these columns, you will also need to provide an extra dataset containing
values for every pixel of the labeled input (to measure magnitude) and another
for the Sky standard deviation (to measure error).
+All such auxiliary input files have to have the same size (number of pixels in
each dimension) as the input labeled image.
+Their numeric data type is irrelevant (they will be converted to 32-bit
floating point internally).
+For the full list of available measurements, see @ref{MakeCatalog
measurements}.
+The ``values'' dataset is used for measurements like brightness/magnitude, or
flux-weighted positions.
+If it is a real image, by default it is assumed to be already Sky-subtracted
prior to running MakeCatalog.
+If it is not, use the @option{--subtractsky} option so that MakeCatalog reads
and subtracts the Sky dataset before any processing.
+To obtain the Sky value, you can use the @option{--sky} option of
@ref{Statistics}, but the recommended method is @ref{NoiseChisel}; see
@ref{Sky value}.
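+For example, a minimal sketch (all file names are hypothetical) where the
+labels, values and Sky estimate are in three separate files:
+
+@example
+$ astmkcatalog seg.fits --valuesfile=image.fits --insky=sky.fits \
+               --subtractsky --magnitude --zeropoint=25
+@end example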
+MakeCatalog can also do measurements on sub-structures of detections.
+In other words, it can produce two catalogs.
+Following the nomenclature of Segment (see @ref{Segment}), the main labeled
input dataset is known as ``object'' labels and the (optional) sub-structure
input dataset is known as ``clumps''.
+If MakeCatalog is run with the @option{--clumpscat} option, it will also need
a labeled image containing clumps, similar to what Segment produces (see
@ref{Segment output}).
+Since clumps are defined within detected regions (they exist over signal, not
noise), MakeCatalog uses their boundaries to subtract the level of signal under
them.
+There are separate options to explicitly request a file name and HDU/extension
for each of the required input datasets as fully described below (with the
@option{--*file} format).
+When each dataset is in a separate file, these options are necessary.
+However, one great advantage of the FITS file format (that is heavily used in
astronomy) is that it allows the storage of multiple datasets in one file.
+So in most situations (for example, if you are using the outputs of
@ref{NoiseChisel} or @ref{Segment}), all the necessary input datasets can be in
one file.
+When none of the @option{--*file} options are given, MakeCatalog will assume
the necessary input datasets are in the file given as its argument (without any
option).
+When the Sky or Sky standard deviation datasets are necessary and the only
@option{--*file} option called is @option{--valuesfile}, MakeCatalog will
search for these datasets (with the default/given HDUs) in the file given to
@option{--valuesfile} (before looking into the main argument file).
+When the clumps image (necessary with the @option{--clumpscat} option) is
used, MakeCatalog looks into the (possibly existing) @code{NUMLABS} keyword for
the total number of clumps in the image (irrespective of how many objects there
are).
+If it is not present, it will count them and possibly re-label the clumps so
the clump labels always start with 1 and finish with the total number of clumps
in each object.
+The re-labeled clumps image will be stored with the @file{-clumps-relab.fits}
suffix.
+This can slightly slow down the run.
+Note that @code{NUMLABS} is automatically written by Segment in its outputs,
so if you are feeding Segment's clump labels, you can benefit from the improved
speed.
+Otherwise, if you are creating the clumps label dataset manually, it may be
good to include the @code{NUMLABS} keyword in its header and also be sure that
there is no gap in the clump labels.
+For example, if an object has three clumps, they should be labeled as 1, 2, 3.
+If they are labeled as 1, 3, 4, or any other set of three positive integers
that do not increase by one, you might get undefined behavior.
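+For example, a sketch of adding this keyword with Gnuastro's Fits program (the
+file name, HDU and number of clumps are hypothetical):
+
+@example
+$ astfits clumps.fits --hdu=1 --write=NUMLABS,10,"Total clumps"
+@end example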
+It may happen that your labeled objects image was created with a program that
only outputs floating point files.
+However, you know it only has integer valued pixels that are stored in a
floating point container.
+In such cases, you can use Gnuastro's Arithmetic program (see
@ref{Arithmetic}) to change the numerical data type of the image
(@file{float.fits}) to an integer type image (@file{int.fits}) with a command
like below:
+@example
+$ astarithmetic float.fits int32 --output=int.fits
+@end example
+To summarize: if the input file to MakeCatalog is the default/full output of
Segment (see @ref{Segment output}) you do not have to worry about any of the
@option{--*file} options below.
+You can just give Segment's output file to MakeCatalog as described in
@ref{Invoking astmkcatalog}.
+To feed NoiseChisel's output into MakeCatalog, just change the labeled
dataset's header (with @option{--hdu=DETECTIONS}).
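+For example, a minimal sketch (the file name is hypothetical) of a catalog
+from NoiseChisel's output:
+
+@example
+$ astmkcatalog nc-out.fits --hdu=DETECTIONS --ids --geo-area
+@end example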
+The full list of input dataset options and general setting options are
described below.
+@table @option
+@item -l FITS
+@itemx --clumpsfile=FITS
+The FITS file containing the labeled clumps dataset when @option{--clumpscat}
is called (see @ref{MakeCatalog output}).
+When @option{--clumpscat} is called, but this option is not, MakeCatalog will
look into the main input file (given as an argument) for the required
extension/HDU (value to @option{--clumpshdu}).
-@node Match, , MakeCatalog, Data analysis
-@section Match
+@item --clumpshdu=STR
+The HDU/extension of the clump labels dataset.
+The clump labels dataset has to be an integer data type (see @ref{Numeric
data types}) and only pixels with a value larger than zero will be used.
+See @ref{Segment output} for a description of the expected format.
-Data can come from different telescopes, filters, software and even different
configurations for a single software.
-As a result, one of the primary things to do after generating catalogs from
each of these sources (for example, with @ref{MakeCatalog}), is to find which
sources in one catalog correspond to which in the other(s).
-In other words, to `match' the two catalogs with each other.
+@item -v FITS
+@itemx --valuesfile=FITS
+The file name of the (sky-subtracted) values dataset.
+When any of the columns need values to associate with the input labels (for
example, to measure the sum of pixel values or magnitude of a galaxy, see
@ref{Brightness flux magnitude}), MakeCatalog will look into a ``values''
dataset for the respective pixel values.
+In most common processing, this is the actual astronomical image over which
the labels were defined or detected.
+The HDU/extension of this dataset in the given file can be specified with
@option{--valueshdu}.
+If this option is not called, MakeCatalog will look for the given extension in
the main input file.
-Gnuastro's Match program is in charge of such operations.
-The nearest objects in the two catalogs, within the given aperture, will be
found and given as output.
-The aperture can be a circle or an ellipse with any orientation.
+@item --valueshdu=STR/INT
+The name or number (counting from zero) of the extension containing the
``values'' dataset, see the descriptions above and those in
@option{--valuesfile} for more.
-@menu
-* Matching algorithms:: Different ways to find the match
-* Invoking astmatch:: Inputs, outputs and options of Match
-@end menu
+@item -s FITS/FLT
+@itemx --insky=FITS/FLT
+Sky value as a single number, or the file name containing a dataset (different
values per pixel or tile).
+The Sky dataset is only necessary when @option{--subtractsky} is called or
when a column directly related to the Sky value is requested (currently
@option{--sky}).
+This dataset may be a tessellation, with one element per tile (see
@option{--oneelempertile} of NoiseChisel's @ref{Processing options}).
-@node Matching algorithms, Invoking astmatch, Match, Match
-@subsection Matching algorithms
+When the Sky dataset is necessary but this option is not called, MakeCatalog
will assume it is an HDU/extension (specified by @option{--skyhdu}) in one of
the already given files.
+First it will look for it in the @option{--valuesfile} (if it is given) and
then the main input file (given as an argument).
-Matching involves two catalogs, let's call them catalog A (with N rows) and
catalog B (with M rows).
-The most basic matching algorithm that immediately comes to mind is this:
-for each row in A (let's call it @mymath{A_i}), go over all the rows in B
(@mymath{B_j}, where @mymath{0<j<M}) and calculate the distance
@mymath{|B_j-A_i|}.
-If this distance is less than a certain acceptable distance threshold (or
radius, or aperture), consider @mymath{A_i} and @mymath{B_j} as a match.
+By default the values dataset is assumed to be already Sky subtracted, so
+this dataset is not necessary for many of the columns.
-This basic parsing algorithm is very computationally expensive:
-@mymath{N\times M} distances have to be measured, and calculating each
distance requires a square root and squaring: in 2 dimensions it would be
@mymath{\sqrt{(B_{jx}-A_{ix})^2+(B_{jy}-A_{iy})^2}}.
-If an elliptical aperture is necessary, it can even get more complicated, see
@ref{Defining an ellipse and ellipsoid}.
-Such operations are not simple, and will consume many cycles of your CPU!
-As a result, this basic algorithm will become terribly slow as your datasets
grow in size; for example, when N or M exceed hundreds of thousands (which is
common nowadays with datasets like those of the European Space Agency's Gaia
mission).
-Therefore that basic parsing algorithm will take too much time and more
efficient ways to @emph{find the nearest neighbor} need to be found.
-Gnuastro's Match currently has the following algorithms for finding the
nearest neighbor:
+@item --skyhdu=STR
+HDU/extension of the Sky dataset, see @option{--insky}.
-@table @asis
-@item Sort-based
-In this algorithm, we will use a moving window over the sorted datasets:
+@item --subtractsky
+Subtract the sky value or dataset from the values file prior to any
+processing.
-@enumerate
-@item
-Sort the two datasets by their first coordinate.
-Therefore @mymath{A_i<A_j} (when @mymath{i<j}; only in first coordinate), and
similarly, sort the elements of B based on the first coordinate.
-@item
-Use the radial distance threshold to define the width of a moving interval
over both A and B.
-Therefore, with a single parsing of both simultaneously, for each A-point, we
can find all the elements in B that are sufficiently near to it (within the
requested aperture).
-@end enumerate
+@item -t STR/FLT
+@itemx --instd=STR/FLT
+Sky standard deviation value as a single number, or the file name containing a
dataset (different values per pixel or tile).
+With the @option{--variance} option you can tell MakeCatalog to interpret this
value/dataset as a variance image, not standard deviation.
-This method has some caveats:
-1) It requires sorting, which can again be slow on large numbers.
-2) It can only be done on a single CPU thread, so it cannot benefit from
modern CPUs with many threads.
-3) There is no way to preserve intermediate information for future matches;
such information could greatly help when one of the matched datasets is always
the same.
-To use this sorting method in Match, use @option{--kdtree=disable}.
+@strong{Important note:} This must only be the SKY standard deviation or
variance (not including the signal's contribution to the error).
+In other words, the final standard deviation of a pixel depends on how much
signal there is in it.
+MakeCatalog will find the amount of signal within each pixel (while
subtracting the Sky, if @option{--subtractsky} is called) and account for the
extra error due to its value (signal).
+Therefore if the input standard deviation (or variance) image also contains
the contribution of signal to the error, then the final error measurements will
be over-estimated.
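+For example, a minimal sketch (file names are hypothetical) where the Sky
+standard deviation is in a separate file:
+
+@example
+$ astmkcatalog seg.fits --instd=sky-std.fits --sn
+@end example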
-@item k-d tree based
-The k-d tree concept is much more abstract, but powerful (addressing all the
caveats of the sort-based method described above).
-In short a k-d tree is a partitioning of a k-dimensional space (``k'' is just
a place-holder, so together with ``d'' for dimension, ``k-d'' means ``any
number of dimensions''!).
-The k-d tree of table A is another table with the same number of rows, but
only two integer columns: the integers contain the row indexes (counting from
zero) of the left and right ``branch'' (in the ``tree'') of that row.
-With a k-d tree we can find the nearest point with far fewer checks
(statistically), compared to always parsing from the top down.
-For more on the k-d tree concept and Gnuastro's implementation, please see
@ref{K-d tree}.
+@item --stdhdu=STR
+The HDU of the Sky standard deviation image (see @option{--instd}).
-When given two catalogs (like the command below), Gnuastro's Match will
internally construct a k-d tree for catalog A (the first catalog given to it)
and use the k-d tree of A, for finding the nearest B-point(s) to each A-point
(this is done in parallel on all available CPU threads, unless you specify a
certain number of threads to use with @option{--numthreads}, see
@ref{Multi-threaded operations}).
-@example
-$ astmatch A.fits --ccol1=ra,dec B.fits --ccol2=RA,DEC \
- --aperture=1/3600
-@end example
-However, optionally, you can also build the k-d tree of A and save it into a
file, with a separate call to Match, like below
-@example
-$ astmatch A.fits --ccol1=ra,dec --kdtree=build \
- --output=A-kdtree.fits
-@end example
-This external k-d tree (@file{A-kdtree.fits}) can be fed to Match later (to
avoid having to reconstruct it every time you want to match a new catalog with
A) like below for matching both @file{B.fits} and @file{C.fits} with
@file{A.fits}.
-Note that the same @option{--kdtree} option above, is now given the file name
of the k-d tree, instead of @code{build}.
-@example
-$ astmatch A.fits --ccol1=ra,dec --kdtree=A-kdtree.fits \
- B.fits --ccol2=RA,DEC --aperture=1/3600 \
- --output=A-B.fits
-$ astmatch A.fits --ccol1=ra,dec --kdtree=A-kdtree.fits \
- C.fits --ccol2=RA,DEC --aperture=1/3600 \
- --output=A-C.fits
-@end example
+@item --variance
+The dataset given to @option{--instd} (and @option{--stdhdu}) has the Sky
variance of every pixel, not the Sky standard deviation.
-Irrespective of how the k-d tree is made ready (by importing or by
constructing internally), it will be used to find the nearest A-point to each
B-point.
-The k-d tree is parsed independently (on different CPU threads) for each row
of B.
+@item --forcereadstd
+Read the input STD image even if it is not required by any of the requested
columns.
+This is because some of the output catalog's metadata may need it, for
example, to calculate the dataset's surface brightness limit (see
@ref{Quantifying measurement limits}, configured with @option{--sfmagarea} and
@option{--sfmagnsigma} in @ref{MakeCatalog output}).
-There is just one technical issue however: when there is no neighbor within
the acceptable distance, the k-d tree search is forced to parse all elements to
confirm that there is no match!
-Therefore if one catalog only covers a small portion (in the coordinate space)
of the other catalog, the k-d tree algorithm will be forced to parse the full
k-d tree for the majority of points!
-This will dramatically decrease the running speed of Match.
-Therefore, Match first divides the range of the first input in all its
dimensions into bins that have a width of the requested aperture (similar to a
histogram), and will only do the k-d tree based search when the point in
catalog B actually falls within a bin that has at least one element in A.
-@end table
+Furthermore, if the input STD image does not have the @code{MEDSTD} keyword
(that is meant to contain the representative standard deviation of the full
image), with this option, the median will be calculated and used for the
surface brightness limit.
-Above, we described different ways of finding the @mymath{A_i} that is nearest
to each @mymath{B_j}.
-But this is not the whole matching process!
-Let's go ahead with a ``basic'' description of what happens next...
-You may be tempted to remove @mymath{A_i} from the search of matches for
@mymath{B_k} (where @mymath{k>j}).
-Therefore, as you go down B (and more matches are found), you have to
calculate less distances (there are fewer elements in A that remain to be
checked).
-However, this will introduce an important bias: @mymath{A_i} may actually be
closer to @mymath{B_k} than to @mymath{B_j}!
-But because @mymath{B_j} happened to be before @mymath{B_k} in your table,
@mymath{A_i} was removed from the potential search domain of @mymath{B_k}.
-The good match (@mymath{B_k} with @mymath{A_i}) will therefore be lost, and
replaced by a false match between @mymath{B_j} and @mymath{A_i}!
+@item -z FLT
+@itemx --zeropoint=FLT
+The zero point magnitude for the input image, see @ref{Brightness flux
magnitude}.
-In a single-dimensional match, this bias depends on the sorting of your two
datasets (leading to different matches if you shuffle your datasets).
-But it will get more complex as you add dimensionality.
-For example, catalogs derived from 2D images or 3D cubes, where you have 2 and
3 different coordinates for each point.
+@item --sigmaclip=FLT,FLT
+The sigma-clipping parameters when any of the sigma-clipping related columns
are requested (for example, @option{--sigclip-median} or
@option{--sigclip-number}).
-To address this problem, in Gnuastro (the Match program, or the matching
functions of the library) similar to above, we first parse over the elements of
B.
-But we will not associate the first nearest-neighbor with a match!
-Instead, we will use an array (with the number of rows in A, let's call it
``B-in-A'') to keep the list of all nearest element(s) in B that match each
A-point.
-Once all the points in B are parsed, each A-point in B-in-A will (possibly)
have a sorted list of B-points (there may be multiple B-points that fall within
the acceptable aperture of each A-point).
-In the previous example, the @mymath{i} element (corresponding to
@mymath{A_i}) of B-in-A will contain the following list of B-points:
@mymath{B_j} and @mymath{B_k}.
+This option takes two values: the first is the multiple of @mymath{\sigma},
and the second is the termination criteria.
+If the latter is larger than 1, it is read as an integer number and will be
the number of times to clip.
+If it is smaller than 1, it is interpreted as the tolerance level to stop
clipping.
+See @ref{Sigma clipping} for a complete explanation.
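+For example, a minimal sketch (the input name is hypothetical) of
+@mymath{3\sigma} clipping with a tolerance of 0.2:
+
+@example
+$ astmkcatalog seg.fits --sigmaclip=3,0.2 --sigclip-median
+@end example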
-A new array (with the number of points in B, let's call it A-in-B) is then
used to find the final match.
-We parse over B-in-A (that was completed above), and from it extract the
nearest B-point to each A-point (@mymath{B_k} for @mymath{A_i} in the example
above).
-If this is the first A-point that is found for this B-point, then we put this
A-point into A-in-B (in the example above, element @mymath{k} is filled with
@mymath{A_i}).
-If another A-point was previously found for this B-point, then the distance of
the two A-points to that B-point are compared, and the A-point with the smaller
distance is kept in A-in-B.
-This will give us the best match between the two catalogs, independent of any
sorting issues.
-Both the B-in-A and A-in-B will also keep the distances, so distances are only
measured once.
+@item --frac-max=FLT[,FLT]
+The fractions (one or two) of maximum value in objects or clumps to be used in
the related columns, for example, @option{--frac-max1-area},
@option{--frac-max1-sum} or @option{--frac-max1-radius}, see @ref{MakeCatalog
measurements}.
+For the maximum value, see the description of the @option{--maximum} column
(in @ref{MakeCatalog measurements}).
+The value(s) of this option must be larger than 0 and smaller than 1 (they are
a fraction).
+When only @option{--frac-max1-area} or @option{--frac-max1-sum} is requested,
one value must be given to this option, but if @option{--frac-max2-area} or
@option{--frac-max2-sum} are also requested, two values must be given to this
option.
+The values can be written as simple floating point numbers, or as fractions,
for example, @code{0.25,0.75} and @code{0.25,3/4} are the same.
-@noindent
-In summary, here are the points to consider when selecting an algorithm, or
the order of your inputs (for optimal speed, the match will be the same):
-@itemize
-@item
-For larger datasets, the k-d tree based method (when running on all possible
threads) is much faster than the classical sort-based method.
-@item
-The k-d tree is constructed for the first input table and the multi-threading
is done on the rows of the second input table.
-The construction of a larger dataset's k-d tree will take longer, but
multi-threading will work better when you have more rows.
-As a result, the optimal way to place your inputs is to give the smaller input
table (with fewer rows) as the first argument (so its k-d tree is constructed),
and the larger table as the second argument (so its rows are checked in
parallel).
-@item
-If you always need to match against one catalog (that is large!), the k-d tree
construction itself can take a significant fraction of the running time.
-Therefore you can save its k-d tree into a file and simply give it to later
calls, like the example given in the description of the k-d algorithm mentioned
above.
-@end itemize
+@item --spatialresolution=FLT
+The error in measuring spatial properties (for example, the area) in units of
pixels.
+You can think of this as the FWHM of the dataset's PSF; it is used in
measurements like the error in surface brightness (@option{--sb-error}, see
@ref{MakeCatalog measurements}).
+Ideally, images are taken with optimal Nyquist sampling (see @ref{Sampling
theorem}), so the default value for this option is 2.
+But in practice real images may be over-sampled (usually ground-based images,
where you will need to increase the default value) or under-sampled (some
space-based images, where you will need to decrease the default value).
-@node Invoking astmatch, , Matching algorithms, Match
-@subsection Invoking Match
+@item --inbetweenints
+Output will contain one row for all integers between 1 and the largest label
in the input (irrespective of their existence in the input image).
+By default, MakeCatalog's output will only contain rows with integers that
actually correspond to at least one pixel in the input dataset.
-When given two catalogs, Match finds the rows that are nearest to each other
within an input aperture.
-The executable name is @file{astmatch} with the following general template
+For example, if the input's only labeled pixel values are 11 and 13,
MakeCatalog's default output will only have two rows.
+If you use this option, it will have 13 rows, and in the rows whose integer
identifiers did not correspond to any pixel, all columns will be 0 or NaN
(depending on context).
+@end table
-@example
-$ astmatch [OPTION ...] input-1 input-2
-@end example
-@noindent
-One line examples:
-@example
-## 1D wavelength match (within 5 angstroms) of the two inputs.
-## The wavelengths are in the 5th and 10th columns respectively.
-$ astmatch --aperture=5e-10 --ccol1=5 --ccol2=10 in1.fits in2.txt
-## Find the row that is closest to (RA,DEC) of (12.3456,6.7890)
-## with a maximum distance of 1 arcseconds (1/3600 degrees).
-## The coordinates can also be given in sexagesimal.
-$ astmatch input1.txt --ccol1=ra,dec --coord=12.3456,6.7890 \
- --aperture=1/3600
-## Find matching rows of two catalogs with a circular aperture
-## of width 2 (same unit as position columns: pixels in this case).
-$ astmatch input1.txt input2.fits --aperture=2 \
- --ccol1=X,Y --ccol2=IMG_X,IMG_Y
+@node Upper-limit settings, MakeCatalog output, MakeCatalog inputs and basic
settings, Invoking astmkcatalog
+@subsubsection Upper-limit settings
-## Similar to before, but the output is created by merging various
-## columns from the two inputs: columns 1, RA, DEC from the first
-## input, followed by all columns starting with `MAG' and the `BRG'
-## column from second input and the 10th column from first input.
-$ astmatch input1.txt input2.fits --aperture=1/3600 \
- --ccol1=ra,dec --ccol2=RAJ2000,DEJ2000 \
- --outcols=a1,aRA,aDEC,b/^MAG/,bBRG,a10
+The upper-limit magnitude was discussed in @ref{Quantifying measurement
limits}.
+Unlike other measured values/columns in MakeCatalog, the upper limit magnitude
needs several extra parameters which are discussed here.
+All the options specific to the upper-limit measurements start with
@option{up} for ``upper-limit''.
+The only exception is @option{--envseed} that is also present in other
programs and is general for any job requiring random number generation in
Gnuastro (see @ref{Generating random numbers}).
-## Assuming both inputs have the same column metadata (same name
-## and numeric type), the output will contain all the rows of the
-## first input, appended with the non-matching rows of the second
-## input (good when you need to merge multiple catalogs that
-## may have matching items, which you do not want to repeat).
-$ astmatch input1.fits input2.fits --ccol1=RA,DEC --ccol2=RA,DEC \
- --aperture=1/3600 --notmatched --outcols=_all
+@cindex Reproducibility
+One very important consideration in Gnuastro is reproducibility.
+Therefore, the values to all of these parameters along with others (like the
random number generator type and seed) are also reported in the comments of the
final catalog when the upper limit magnitude column is desired.
+The random seed that is used to define the random positions for each object or
clump is unique and set based on the (optionally) given seed, the total number
of objects and clumps and also the labels of the clumps and objects.
+So with identical inputs, an identical upper-limit magnitude will be found.
+However, even if the seed is identical, when the ordering of the object/clump
labels differs between different runs, the result of upper-limit measurements
will not be identical.
-## Match the two catalogs within an elliptical aperture of 1 and 2
-## arc-seconds along RA and Dec respectively.
-$ astmatch --aperture=1/3600,2/3600 in1.fits in2.txt
+MakeCatalog will randomly place the object/clump footprint over the dataset.
+When the randomly placed footprint does not fall on any object or masked
region (see @option{--upmaskfile}) it will be used in the final distribution.
+Otherwise that particular random position will be ignored and another random
position will be generated.
+Finally, when the distribution has the desired number of successfully measured
random samples (@option{--upnum}) the distribution's properties will be
measured and placed in the catalog.
-## Match the RA and DEC columns of the first input with the RA_D
-## and DEC_D columns of the second within a 0.5 arcseconds aperture.
-$ astmatch --ccol1=RA,DEC --ccol2=RA_D,DEC_D --aperture=0.5/3600 \
- in1.fits in2.fits
+When the profile is very large or the image is significantly covered by
detections, it might not be possible to find the desired number of samplings in
a reasonable time.
+MakeCatalog will continue searching until it is unable to find a successful
position (since the last successful measurement@footnote{The counting of failed
positions restarts on every successful measurement.}), for a large multiple of
@option{--upnum} (currently@footnote{In Gnuastro's source, this constant number
is defined as the @code{MKCATALOG_UPPERLIMIT_MAXFAILS_MULTIP} macro in
@file{bin/mkcatalog/main.h}, see @ref{Downloading the source}.} this is 10).
+If @option{--upnum} successful samples cannot be found until this limit is
reached, MakeCatalog will set the upper-limit magnitude for that object to NaN
(blank).
-## Match in 3D (RA, Dec and Wavelength).
-$ astmatch --ccol1=2,3,4 --ccol2=2,3,4 -a0.5/3600,0.5/3600,5e-10 \
- in1.fits in2.txt
-@end example
+MakeCatalog will also print a warning if the range of positions available for
the labeled region is smaller than double the size of the region.
+In such cases, the limited range of random positions can artificially decrease
the standard deviation of the final distribution.
+If your dataset can allow it (it is large enough), it is recommended to use a
larger range if you see such warnings.
-Match will find the rows that are nearest to each other in two catalogs (given
some coordinate columns).
-Alternatively, it can construct the k-d tree of one catalog to save in a FITS
file for future matching of the same catalog with many others.
-To understand the inner working of Match and its algorithms, see @ref{Matching
algorithms}.
+@table @option
-When matching, two catalogs are necessary for input.
-But for constructing a k-d tree, only a single catalog should be given.
-The input tables can be plain text tables or FITS tables, for more see
@ref{Tables}.
-But other ways of feeding inputs are also supported:
-@itemize
-@item
-The @emph{first} catalog can also come from the standard input (for example, a
pipe that feeds the output of a previous command to Match, see @ref{Standard
input});
-@item
-When you only want to match one point with another catalog, you can use the
@option{--coord} option to avoid creating a file for the @emph{second} input
catalog.
-@end itemize
+@item --upmaskfile=FITS
+File name of mask image to use for upper-limit calculation.
+In some cases (especially when doing matched photometry), the object labels
specified in the main input and mask image might not be adequate.
+In other words they do not necessarily have to cover @emph{all} detected
objects: the user might have selected only a few of the objects in their
labeled image.
+This option can be used to ignore regions in the image in these situations
when estimating the upper-limit magnitude.
+All the non-zero pixels of the image specified by this option (in the
@option{--upmaskhdu} extension) will be ignored in the upper-limit magnitude
measurements.
-Match follows the same basic behavior of all Gnuastro programs as fully
described in @ref{Common program behavior}.
-If the first input is a FITS file, the common @option{--hdu} option (see
@ref{Input output options}) should be used to identify the extension.
-When the second input is FITS, the extension must be specified with
@option{--hdu2}.
+For example, when you are using labels from another image, you can give
NoiseChisel's objects image output for this image as the value to this option.
+In this way, you can be sure that regions with data do not harm your
distribution.
+See @ref{Quantifying measurement limits} for more on the upper limit magnitude.
-When @option{--quiet} is not called, Match will print its various processing
phases (including the number of matches found) in standard output (on the
command-line).
-When matches are found, by default, two tables will be output (if in FITS
format, as two HDUs).
-Each output table will contain the re-arranged rows of the respective input
table.
-In other words, both tables will have the same number of rows, and row N in
both corresponds to the Nth match between the two.
-If no matches are found, the columns of the output table(s) will have zero
rows (with proper meta-data).
-The output format can be changed with the following options:
-@itemize
-@item
-@option{--outcols}: The output will be a single table with rows chosen from
either of the two inputs in any order.
-@item
-@option{--notmatched}: The output tables will contain the rows that did not
match between the two tables.
-If called with @option{--outcols}, the output will be a single table with all
non-matched rows of both tables.
-@item
-@option{--logasoutput}: The output will be a single table with the contents of
the log file, see below.
-@end itemize
+@item --upmaskhdu=STR
+The extension in the file specified by @option{--upmaskfile}.
-If no output file name is given with the @option{--output} option, then
automatic output @ref{Automatic output} will be used to determine the output
name(s).
-Depending on @option{--tableformat} (see @ref{Input output options}), the
output will be a (possibly multi-extension) FITS file or (possibly two) plain
text file(s).
-Generally, giving a filename to @option{--output} is recommended.
+@item --upnum=INT
+The number of random samples to take for every object.
+A larger value will give a more accurate result (asymptotically), but it will also slow down the process.
+When a randomly positioned sample overlaps with a detected/masked pixel, it is not counted and another random position is found, until the sampled region lies completely over an undetected region.
+So you can be sure that for each object, this many samples over undetected regions are made.
+See the upper limit magnitude discussion in @ref{Quantifying measurement
limits} for more.
-When the @option{--log} option is called (see @ref{Operating mode options}),
and there was a match, Match will also create a file named @file{astmatch.fits}
(or @file{astmatch.txt}, depending on @option{--tableformat}, see @ref{Input
output options}) in the directory it is run in.
-This log table will have three columns.
-The first and second columns show the matching row/record number (counting
from 1) of the first and second input catalogs respectively.
-The third column is the distance between the two matched positions.
-The units of the distance are the same as the given coordinates (given the
possible ellipticity, see description of @option{--aperture} below).
-When @option{--logasoutput} is called, no log file (with a fixed name) will be
created.
-In this case, the output file (possibly given by the @option{--output} option)
will have the contents of this log file.
+@item --uprange=INT,INT
+The range/width of the region (in pixels) to do random sampling along each
dimension of the input image around each object's position.
+This is not a mandatory option and if not given (or given a value of zero in a
dimension), the full possible range of the dataset along that dimension will be
used.
+This is useful when the noise properties of the dataset vary gradually.
+In such cases, using the full range of the input dataset is going to bias the
result.
+However, note that decreasing the range of available positions too much will
also artificially decrease the standard deviation of the final distribution
(and thus bias the upper-limit measurement).
-@cartouche
-@noindent
-@strong{@option{--log} is not thread-safe}: As described above, when
@option{--logasoutput} is not called, the Log file has a fixed name for all
calls to Match.
-Therefore if a separate log is requested in two simultaneous calls to Match in
the same directory, Match will try to write to the same file.
-This will cause problems like a corrupted log file, undefined behavior, or a crash.
-Remember that @option{--log} is mainly intended for debugging purposes; if you want the log file with a specific name, simply use @option{--logasoutput} (which will also be faster, since no arranging of the input columns is necessary).
-@end cartouche
+@item --envseed
+@cindex Seed, Random number generator
+@cindex Random number generator, Seed
+Read the random number generator type and seed value from the environment (see
@ref{Generating random numbers}).
+Random numbers are used in calculating the random positions of different
samples of each object.
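+
+For example, the minimal sketch below (the seed value is arbitrary) will make the randomly selected positions, and therefore the upper-limit measurements, reproducible between runs:
+
+@example
+## Fix the random number generator seed through the environment
+## so the random sample positions are reproducible.
+$ export GSL_RNG_SEED=1599251212
+$ astmkcatalog labels.fits --ids --magnitude --envseed
+@end example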
-@table @option
-@item -H STR
-@itemx --hdu2=STR
-The extension/HDU of the second input if it is a FITS file.
-When it is not a FITS file, this option's value is ignored.
-For the first input, the common option @option{--hdu} must be used.
+@item --upsigmaclip=FLT,FLT
+The raw distribution of random values will not be used to find the upper-limit magnitude; it will first be @mymath{\sigma}-clipped (see @ref{Sigma clipping}) to avoid outliers in the distribution (mainly the faint undetected wings of bright/large objects in the image).
+This option takes two values: the first is the multiple of @mymath{\sigma},
and the second is the termination criteria.
+If the latter is larger than 1, it is read as an integer number and will be
the number of times to clip.
+If it is smaller than 1, it is interpreted as the tolerance level to stop
clipping. See @ref{Sigma clipping} for a complete explanation.
-@item -k STR
-@itemx --kdtree=STR
-Select the algorithm and/or the way to construct or import the k-d tree.
-A summary of the four acceptable strings for this option are described here
for completeness.
-However, for a much more detailed discussion on Match's algorithms with
examples, see @ref{Matching algorithms}.
-@table @code
-@item internal
-Construct a k-d tree for the first input internally (within the same run of
Match), and parallelize over the rows of the second to find the nearest points.
-This is the default algorithm/method used by Match (when this option is not
called).
-@item build
-Only construct the k-d tree of the single given input and abort (no matching is done).
-The k-d tree will be saved in the file given to @option{--output}.
-@item CUSTOM-FITS-FILE
-Use the given FITS file as a k-d tree (that was previously constructed with
Match itself) of the first input, and do not construct any k-d tree internally.
-The FITS file should have two columns with an unsigned 32-bit integer data
type and a @code{KDTROOT} keyword that contains the index of the root of the
k-d tree.
-For more on Gnuastro's k-d tree format, see @ref{K-d tree}.
-@item disable
-Do not use the k-d tree algorithm for finding the nearest neighbor; instead, use the sort-based method.
-@end table
+@item --upnsigma=FLT
+The multiple of the final (@mymath{\sigma}-clipped) standard deviation (or
@mymath{\sigma}) used to measure the upper-limit sum or magnitude.
-@item --kdtreehdu=STR
-The HDU of the FITS file, when a FITS file is given to the @option{--kdtree}
option that was described above.
+@item --checkuplim=INT[,INT]
+Print a table of positions and measured values for the full random distribution used for one particular object or clump.
+If only one integer is given to this option, it is interpreted to be an
object's label.
+If two values are given, the first is the object label and the second is the ID of the requested clump within it.
-@item --outcols=STR[,STR,[...]]
-Columns (from both inputs) to write into a single matched table output.
-The value to @code{--outcols} must be a comma-separated list of column
identifiers (number or name, see @ref{Selecting table columns}).
-The expected format depends on @option{--notmatched} and explained below.
-By default (when @option{--notmatched} is not called), the number of rows in the output will be equal to the number of matches.
-However, when @option{--notmatched} is called, all the rows (from the
requested columns) of the first input are placed in the output, and the
not-matched rows of the second input are inserted afterwards (useful when you
want to merge unique entries of multiple catalogs into one).
+The output is a table with three columns (whether it is FITS or plain-text is
determined with the @option{--tableformat} option, see @ref{Input output
options}).
+The first two columns are the pixel X,Y positions of the center of each
label's tile (see next paragraph), in each random sampling of this particular
object/clump.
+The third column is the measured flux over that region.
+If the region overlapped with a detection or masked pixel, then its measured
value will be a NaN (not-a-number).
+The total number of rows is thus unknown before running.
+However, if an upper-limit measurement was made in the main output of
MakeCatalog, you can be sure that the number of rows with non-NaN measurements
is the number given to the @option{--upnum} option.
+
+The ``tile'' of each label is defined by the minimum and maximum positions of
each label: values of the @option{--min-x}, @option{--max-x}, @option{--min-y}
and @option{--max-y} columns in the main output table for each label.
+Therefore, the tile center position that is recorded in this output ignores the distribution of labeled pixels within the tile.
+
+Precise interpretation of the position is only relevant when the footprint of your label is highly asymmetric and you want to use this catalog to insert your object into the image.
+In such a case, you can also ask for @option{--min-x} and @option{--min-y} and manually calculate their difference with the following two positional measurements of your desired label: @option{--geo-x} and @option{--geo-y} (which report the label's ``geometric'' center, only using the label positions and ignoring any ``values'') or @option{--x} and @option{--y} (which report the value-weighted center of the label).
+Adding that difference to the position reported here will let you define alternative ``center''s for your label in particular situations (this will usually not be necessary!).
+For more on these positional columns, see @ref{Position measurements in
pixels}.
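+
+For example, assuming a hypothetical labeled input @file{labels.fits} containing an object with label 123 that has a clump 2 within it, a check table can be produced with a sketch like this:
+
+@example
+## Save the random distribution used for clump 2 of object 123
+## (the check table's name follows the automatic output rules).
+$ astmkcatalog labels.fits --ids --magnitude --checkuplim=123,2
+@end example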
+@end table
-@table @asis
-@item Default (only matching rows)
-The first character of each string specifies the input catalog: @option{a} for
the first and @option{b} for the second.
-The rest of the characters of the string will be directly used to identify the
proper column(s) in the respective table.
-See @ref{Selecting table columns} for how columns can be specified in Gnuastro.
-For example, the output of @option{--outcols=a1,bRA,bDEC} will have three
columns: the first column of the first input, along with the @option{RA} and
@option{DEC} columns of the second input.
+@node MakeCatalog output, , Upper-limit settings, Invoking astmkcatalog
+@subsubsection MakeCatalog output
+After it has completed all the requested measurements (see @ref{MakeCatalog
measurements}), MakeCatalog will store its measurements in table(s).
+If an output filename is given (see @option{--output} in @ref{Input output
options}), the format of the table will be deduced from the name.
+When it is not given, the input name will be appended with a @file{_cat}
suffix (see @ref{Automatic output}) and its format will be determined from the
@option{--tableformat} option, which is also discussed in @ref{Input output
options}.
+@option{--tableformat} is also necessary when the requested output name is a
FITS table (recall that FITS can accept ASCII and binary tables, see
@ref{Table}).
-If the string after @option{a} or @option{b} is @option{_all}, then all the
columns of the respective input file will be written in the output.
-For example, the command below will print all the input columns from the first
catalog along with the 5th column from the second:
+By default (when @option{--spectrum} or @option{--clumpscat} are not called)
only a single catalog/table will be created for the labeled ``objects''.
-@example
-$ astmatch a.fits b.fits --outcols=a_all,b5
-@end example
+@itemize
+@item
+If @option{--clumpscat} is called, a secondary catalog/table will also be created for ``clumps'' (one of the outputs of the Segment program; for more on ``objects'' and ``clumps'', see @ref{Segment}).
+In short, if you only have one labeled image, you do not have to worry about clumps and can simply ignore this option (a minimal sketch of its usage is given after this list).
+@item
+When @option{--spectrum} is called, it is not mandatory to specify any
single-valued measurement columns. In this case, the output will only be the
spectra of each labeled region within a 3D datacube.
+For more, see the description of @option{--spectrum} in @ref{MakeCatalog
measurements}.
+@end itemize
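+
+For example, assuming a hypothetical Segment output @file{seg.fits} (containing both @code{OBJECTS} and @code{CLUMPS} extensions), the minimal sketch below would write the two catalogs as two HDUs of @file{cat.fits}:
+
+@example
+## Object measurements in one HDU, clump measurements in another.
+$ astmkcatalog seg.fits --ids --ra --dec --magnitude \
+               --clumpscat --output=cat.fits
+@end example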
-@code{_all} can be used multiple times, possibly on both inputs.
-Tip: if an input's column is called @code{_all} (an unlikely name!) and you do not want all the columns from that table in the output, use its column number to avoid confusion.
+@cindex Surface brightness limit
+@cindex Limit, Surface brightness
+When possible, MakeCatalog will also measure the full input's noise level
(also known as surface brightness limit, see @ref{Quantifying measurement
limits}).
+Since these measurements are related to the noise and not any particular
labeled object, they are stored as keywords in the output table.
+Furthermore, they are only possible when a standard deviation image has been
loaded (done automatically for any column measurement that involves noise, for
example, @option{--sn}, @option{--magnitude-error} or @option{--sky-std}).
+But if you just want the surface brightness limit and no noise-related column,
you can use @option{--forcereadstd}.
+All these keywords start with @code{SBL} (for ``surface brightness limit'')
and are described below:
-Another example is given in the one-line examples above.
-Compared to the default case (where two tables with all their columns are saved separately), using this option is much faster: it will only read and re-arrange the necessary columns and it will write a single output table.
-Combined with regular expressions in large tables, this can be a very powerful
and convenient way to merge various tables into one.
+@table @code
+@item SBLSTD
+Per-pixel standard deviation.
+If a @code{MEDSTD} keyword exists in the standard deviation dataset, then that
value is directly used.
-When @option{--coord} is given, no second catalog will be read.
-The second catalog will be created internally based on the values given to
@option{--coord}.
-So column names are not defined and you can only request integer column
numbers that are less than the number of coordinates given to @option{--coord}.
-For example, if you want to find the row matching RA of 1.2345 and Dec of
6.7890, then you should use @option{--coord=1.2345,6.7890}.
-But when using @option{--outcols}, you cannot give @code{bRA} or @code{b25}.
+@item SBLNSIG
+Sigma multiple for surface brightness limit (value you gave to
@option{--sfmagnsigma}), used for @code{SBLMAGPX} and @code{SBLMAG}.
-@item With @option{--notmatched}
-Only the column names/numbers should be given (for example,
@option{--outcols=RA,DEC,MAGNITUDE}).
-It is assumed that both input tables have the requested column(s) and that the numerical data type of each column in each input (with the same name) is the same as that of the corresponding column in the other.
-Therefore if one input has a @code{MAGNITUDE} column with a 32-bit floating
point type, but the @code{MAGNITUDE} column of the other is 64-bit floating
point, Match will crash with an error.
-The metadata of the columns will come from the first input.
+@item SBLMAGPX
+Per-pixel surface brightness limit (in units of magnitudes/pixel).
-As an example, let's assume @file{input1.txt} and @file{input2.fits} each have
a different number of columns and rows.
-However, they both have the @code{RA} (64-bit floating point), @code{DEC}
(64-bit floating point) and @code{MAGNITUDE} (32-bit floating point) columns.
-If @file{input1.txt} has 100 rows and @file{input2.fits} has 300 rows (such that 50 of them match within 1 arcsec of the first), then the output of the command below will have @mymath{100+(300-50)=350} rows and only three columns.
-Other columns in each catalog, which may be different, are ignored.
+@item SBLAREA
+Area (in units of arcsec@mymath{^2}) used in @code{SBLMAG} (value you gave to
@option{--sfmagarea}).
-@example
-$ astmatch input1.txt --ccol1=RA,DEC \
- input2.fits --ccol2=RA,DEC \
- --aperture=1/3600 \
- --notmatched --outcols=RA,DEC,MAGNITUDE
-@end example
+@item SBLMAG
+Surface brightness limit of data calculated over @code{SBLAREA} (in units of
mag/arcsec@mymath{^2}).
@end table
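+
+Since these are keywords of the output table's HDU, one way to inspect them is with Gnuastro's Fits program; for example, with the minimal sketch below (assuming a hypothetical catalog @file{cat.fits} with the measurements in its first HDU):
+
+@example
+## Print the surface brightness limit keywords of the catalog.
+$ astfits cat.fits -h1 | grep ^SBL
+@end example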
-@item -l
-@itemx --logasoutput
-The output file will have the contents of the log file: indexes in the two
catalogs that match with each other along with their distance, see description
of the log file above.
+When any of the upper-limit measurements are requested, the input parameters
for the upper-limit measurement are stored in the keywords starting with
@code{UP}: @code{UPNSIGMA}, @code{UPNUMBER}, @code{UPRNGNAM}, @code{UPRNGSEE},
@code{UPSCMLTP}, @code{UPSCTOL}.
+These are primarily input arguments, so they correspond to the options with a
similar name.
-When this option is called, a separate log file will not be created and the
output will not contain any of the input columns (either as two tables
containing the re-arranged columns of each input, or a single table mixing
columns), only their indices in the log format.
+The full list of MakeCatalog's options relating to the output file format and keywords is given below.
+See @ref{MakeCatalog measurements} for specifying which columns you want in
the final catalog.
-@item --notmatched
-Write the non-matching rows into the outputs, not the matched ones.
-By default, this will produce two output tables, that will not necessarily
have the same number of rows.
-However, when called with @option{--outcols}, it is possible to import
non-matching rows of the second into the first.
-See the description of @option{--outcols} for more.
+@table @option
+@item -C
+@itemx --clumpscat
+Do measurements on clumps and produce a second catalog (only devoted to
clumps).
+When this option is given, MakeCatalog will also look for a secondary labeled
dataset (identifying substructure) and produce a catalog from that.
+For more on the definition of ``clumps'', see @ref{Segment}.
-@item -c INT/STR[,INT/STR]
-@itemx --ccol1=INT/STR[,INT/STR]
-The coordinate columns of the first input.
-The number of dimensions for the match is determined by the number of
comma-separated values given to this option.
-The values can be the column number (counting from 1), exact column name or a
regular expression.
-For more, see @ref{Selecting table columns}.
-See the one-line examples above for some usages of this option.
+When the output is a FITS file, the objects and clumps catalogs/tables will be
stored as multiple extensions of one FITS file.
+You can use @ref{Table} to inspect the column meta-data and contents in this
case.
+However, in plain text format (see @ref{Gnuastro text table format}), it is
only possible to keep one table per file.
+Therefore, if the output is a text file, two output files will be created,
ending in @file{_o.txt} (for objects) and @file{_c.txt} (for clumps).
-@item -C INT/STR[,INT/STR]
-@itemx --ccol2=INT/STR[,INT/STR]
-The coordinate columns of the second input.
-See the example in @option{--ccol1} for more.
+@item --noclumpsort
+Do not sort the clumps catalog based on object ID (only relevant with @option{--clumpscat}).
+This option will improve the performance@footnote{The performance boost due to @option{--noclumpsort} can only be felt when there are a huge number of objects.
+Therefore, by default the output is sorted to avoid misunderstandings or bugs in the user's scripts when the user forgets to sort the outputs.} of MakeCatalog when it is run on multiple threads @emph{and} the position of the rows in the clumps catalog is irrelevant (for example, when you just want the number counts).
-@item -d FLT[,FLT]
-@itemx --coord=FLT[,FLT]
-Manually specify the coordinates to match against the given catalog.
-With this option, Match will not look for a second input file/table and will
directly use the coordinates given to this option.
-When the coordinates are RA and Dec, the comma-separated values can either be
in degrees (a single number), or sexagesimal (@code{_h_m_} for RA, @code{_d_m_}
for Dec, or @code{_:_:_} for both).
+MakeCatalog does all its measurements on each @emph{object} independently and
in parallel.
+As a result, while it is writing the measurements on each object's clumps, it
does not know how many clumps there were in previous objects.
+Each thread will just fetch the first available row and write the information
of clumps (in order) starting from that row.
+After all the measurements are done, by default (when this option is not
called), MakeCatalog will reorder/permute the clumps catalog to have both the
object and clump ID in an ascending order.
-When this option is called, the output changes in the following ways:
-1) when @option{--outcols} is specified, for the second input, it can only
accept integer numbers that are less than the number of values given to this
option, see description of that option for more.
-2) By default (when @option{--outcols} is not used), only the matching row of
the first table will be output (a single file), not two separate files (one for
each table).
+If you would like to order the catalog later (when it is a plain text file),
you can run the following command to sort the rows by object ID (and clump ID
within each object), assuming they are respectively the first and second
columns:
-This option is good when you have a (large) catalog and only want to match a
single coordinate to it (for example, to find the nearest catalog entry to your
desired point).
-With this option, you can write the coordinates on the command-line and thus
avoid the need to make a single-row file.
+@example
+$ awk '!/^#/' out_c.txt | sort -g -k1,1 -k2,2
+@end example
-@item -a FLT[,FLT[,FLT]]
-@itemx --aperture=FLT[,FLT[,FLT]]
-Parameters of the aperture for matching.
-The values given to this option can be fractions, for example, when the
position columns are in units of degrees, @option{1/3600} can be used to ask
for one arc-second.
-The interpretation of the values depends on the requested dimensions
(determined from @option{--ccol1} and @code{--ccol2}) and how many values are
given to this option.
+@item --sfmagnsigma=FLT
+Value to multiply with the median standard deviation (from a @code{MEDSTD} keyword in the Sky standard deviation image) for estimating the surface brightness limit.
+Note that the surface brightness limit is only reported when a standard
deviation image is read, in other words a column using it is requested (for
example, @option{--sn}) or @option{--forcereadstd} is called.
-When multiple objects are found within the aperture, the match is defined
-as the nearest one. In a multi-dimensional dataset, when the aperture is a
-general ellipse or ellipsoid (and not a circle or sphere), the distance is
-calculated in the elliptical space along the major axis. For the definition
-of this distance, see @mymath{r_{el}} in @ref{Defining an ellipse and
-ellipsoid}.
+This is a per-pixel value, not a per-object/clump one: it is not measured over an area or aperture like the @mymath{5\sigma} depth or upper-limit values that are commonly reported (see @ref{Quantifying measurement limits}).
-@table @asis
-@item 1D match
-The aperture/interval can only take one value: half of the interval around
each point (maximum distance from each point).
+@item --sfmagarea=FLT
+Area (in arcsec@mymath{^2}) used to convert the per-pixel estimation of @option{--sfmagnsigma} into the surface brightness limit reported in the output's metadata (the @code{SBL} keywords described above).
+Note that the surface brightness limit is only reported when a standard
deviation image is read, in other words a column using it is requested (for
example, @option{--sn}) or @option{--forcereadstd} is called.
-@item 2D match
-In a 2D match, the aperture can be a circle, an ellipse aligned in the axes or
an ellipse with a rotated major axis.
-To simplify the usage, you can determine the shape based on the number of free
parameters for each.
+Note that this is just a unit conversion using the World Coordinate System
(WCS) information in the input's header.
+It does not actually do any measurements on this area.
+For random measurements on any area, please use the upper-limit columns of
MakeCatalog (see the discussion on upper-limit measurements in @ref{Quantifying
measurement limits}).
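+
+For example, the hypothetical sketch below would store a @mymath{3\sigma} surface brightness limit over @mymath{100} arcsec@mymath{^2} in the @code{SBL} keywords described previously (no noise-based column is requested here, hence @option{--forcereadstd}):
+
+@example
+## 3-sigma surface brightness limit over 100 arcsec^2.
+$ astmkcatalog labels.fits --ids --magnitude --forcereadstd \
+               --sfmagnsigma=3 --sfmagarea=100
+@end example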
+@end table
-@table @asis
-@item 1 number
-for example, @option{--aperture=2}.
-The aperture will be a circle of the given radius.
-The value will be in the same units as the columns in @option{--ccol1} and
@option{--ccol2}).
-@item 2 numbers
-for example, @option{--aperture=3,4e-10}.
-The aperture will be an ellipse (if the two numbers are different) with the
respective value along each dimension.
-The numbers are in units of the first and second axis.
-In the example above, the semi-axis value along the first axis will be 3 (in
units of the first coordinate) and along the second axis will be
@mymath{4\times10^{-10}} (in units of the second coordinate).
-Such values can happen if you are comparing catalogs of spectra, for example.
-If more than one object exists in the aperture, the nearest will be found
along the major axis as described in @ref{Defining an ellipse and ellipsoid}.
-@item 3 numbers
-for example, @option{--aperture=2,0.6,30}.
-The aperture will be an ellipse (if the second value is not 1).
-The first number is the semi-major axis, the second is the axis ratio and the
third is the position angle (in degrees).
-If multiple matches are found within the ellipse, the distance (to find the
nearest) is calculated along the major axis in the elliptical space, see
@ref{Defining an ellipse and ellipsoid}.
-@end table
-@item 3D match
-The aperture (matching volume) can be a sphere, an ellipsoid aligned on the
three axes, or a general ellipsoid rotated in any direction.
-To simplify the usage, the shape can be determined based on the number of
values given to this option.
-@table @asis
-@item 1 number
-for example, @option{--aperture=3}.
-The matching volume will be a sphere of the given radius.
-The value is in the same units as the input coordinates.
-@item 3 numbers
-for example, @option{--aperture=4,5,6e-10}.
-The aperture will be a general ellipsoid with the respective extent along each
dimension.
-The numbers must be in the same units as each axis.
-This is very similar to the two number case of 2D inputs.
-See there for more.
-@item 6 numbers
-for example, @option{--aperture=4,0.5,0.6,10,20,30}.
-The numbers represent the full general ellipsoid definition (in any
orientation).
-For the definition of a general ellipsoid, see @ref{Defining an ellipse and
ellipsoid}.
-The first number is the semi-major axis.
-The second and third are the two axis ratios.
-The last three are the three Euler angles in units of degrees in the ZXZ order
as fully described in @ref{Defining an ellipse and ellipsoid}.
-@end table
-@end table
-@end table
@@ -28496,1443 +28591,1551 @@ The last three are the three Euler angles in
units of degrees in the ZXZ order a
+@node Match, , MakeCatalog, Data analysis
+@section Match
+Data can come from different telescopes, filters, software, and even different configurations of a single software.
+As a result, one of the primary things to do after generating catalogs from
each of these sources (for example, with @ref{MakeCatalog}), is to find which
sources in one catalog correspond to which in the other(s).
+In other words, to `match' the two catalogs with each other.
+Gnuastro's Match program is in charge of such operations.
+The nearest objects in the two catalogs, within the given aperture, will be
found and given as output.
+The aperture can be a circle or an ellipse with any orientation.
+@menu
+* Matching algorithms:: Different ways to find the match
+* Invoking astmatch:: Inputs, outputs and options of Match
+@end menu
+@node Matching algorithms, Invoking astmatch, Match, Match
+@subsection Matching algorithms
+Matching involves two catalogs, let's call them catalog A (with N rows) and
catalog B (with M rows).
+The most basic matching algorithm that immediately comes to mind is this:
+for each row in A (let's call it @mymath{A_i}), go over all the rows in B
(@mymath{B_j}, where @mymath{0<j<M}) and calculate the distance
@mymath{|B_j-A_i|}.
+If this distance is less than a certain acceptable distance threshold (or
radius, or aperture), consider @mymath{A_i} and @mymath{B_j} as a match.
+This basic parsing algorithm is very computationally expensive:
+@mymath{N\times M} distances have to be measured, and calculating each distance requires squaring and a square root: in 2 dimensions it would be @mymath{\sqrt{(B_{jx}-A_{ix})^2+(B_{jy}-A_{iy})^2}}.
+If an elliptical aperture is necessary, it can even get more complicated, see
@ref{Defining an ellipse and ellipsoid}.
+Such operations are not simple, and will consume many cycles of your CPU!
+As a result, this basic algorithm becomes terribly slow as your datasets grow in size, for example, when @mymath{N} or @mymath{M} exceed hundreds of thousands of rows (which is common these days, with datasets like those of the European Space Agency's Gaia mission).
+Therefore, more efficient ways to @emph{find the nearest neighbor} are necessary.
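+To get a feeling for the scale of the problem: if each catalog has @mymath{10^5} rows, the basic algorithm needs @mymath{10^5\times10^5=10^{10}} distance calculations (each with its squaring and square root), even though at most @mymath{10^5} real matches can exist.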
+Gnuastro's Match currently has the following two algorithms for finding the nearest neighbor:
+@table @asis
+@item Sort-based
+In this algorithm, we will use a moving window over the sorted datasets:
-@node Data modeling, High-level calculations, Data analysis, Top
-@chapter Data modeling
+@enumerate
+@item
+Sort the two datasets by their first coordinate, such that @mymath{A_i<A_j} when @mymath{i<j} (only in the first coordinate), and similarly for the elements of B.
+@item
+Use the radial distance threshold to define the width of a moving interval
over both A and B.
+Therefore, with a single parsing of both simultaneously, for each A-point, we
can find all the elements in B that are sufficiently near to it (within the
requested aperture).
+@end enumerate
-@cindex Fitting
-@cindex Modeling
-In order to fully understand observations after initial analysis on the image,
it is very important to compare them with the existing models to be able to
further understand both the models and the data.
-The tools in this chapter create model galaxies and will provide 2D fittings
to be able to understand the detections.
+This method has some caveats:
+1) It requires sorting, which can again be slow on large numbers.
+2) It can only be done on a single CPU thread, so it cannot benefit from modern CPUs with many threads.
+3) There is no way to preserve intermediate information for future matches (something that would greatly help when one of the matched datasets is always the same).
+To use this sort-based method in Match, use @option{--kdtree=disable}.
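+
+For example, the minimal sketch below (with hypothetical catalog names) matches two catalogs with the sort-based method:
+
+@example
+## Match with the sort-based method (no k-d tree).
+$ astmatch A.fits --ccol1=RA,DEC B.fits --ccol2=RA,DEC \
+           --aperture=1/3600 --kdtree=disable
+@end example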
-@menu
-* MakeProfiles:: Making mock galaxies and stars.
-* MakeNoise:: Make (add) noise to an image.
-@end menu
+@item k-d tree based
+The k-d tree concept is much more abstract, but powerful (addressing all the caveats of the sort-based method described above).
+In short, a k-d tree is a partitioning of a k-dimensional space (``k'' is just a placeholder, so together with ``d'' for dimension, ``k-d'' means ``any number of dimensions''!).
+The k-d tree of table A is another table with the same number of rows, but only two integer columns: the integers contain the row indexes (counting from zero) of the left and right ``branch'' (in the ``tree'') of that row.
+With a k-d tree we can find the nearest point with much fewer (statistically)
checks, compared to always parsing from the top-down.
+For more on the k-d tree concept and Gnuastro's implementation, please see
@ref{K-d tree}.
+When given two catalogs (like the command below), Gnuastro's Match will internally construct a k-d tree for catalog A (the first catalog given to it) and use it to find the nearest A-point to each B-point (this is done in parallel on all available CPU threads, unless you specify a certain number of threads with @option{--numthreads}, see @ref{Multi-threaded operations}).
+@example
+$ astmatch A.fits --ccol1=ra,dec B.fits --ccol2=RA,DEC \
+ --aperture=1/3600
+@end example
+However, optionally, you can also build the k-d tree of A and save it into a file with a separate call to Match, like below:
+@example
+$ astmatch A.fits --ccol1=ra,dec --kdtree=build \
+ --output=A-kdtree.fits
+@end example
+This external k-d tree (@file{A-kdtree.fits}) can be fed to Match later (to
avoid having to reconstruct it every time you want to match a new catalog with
A) like below for matching both @file{B.fits} and @file{C.fits} with
@file{A.fits}.
+Note that the same @option{--kdtree} option above is now given the file name of the k-d tree, instead of @code{build}.
+@example
+$ astmatch A.fits --ccol1=ra,dec --kdtree=A-kdtree.fits \
+ B.fits --ccol2=RA,DEC --aperture=1/3600 \
+ --output=A-B.fits
+$ astmatch A.fits --ccol1=ra,dec --kdtree=A-kdtree.fits \
+ C.fits --ccol2=RA,DEC --aperture=1/3600 \
+ --output=A-C.fits
+@end example
+Irrespective of how the k-d tree is made ready (by importing or by
constructing internally), it will be used to find the nearest A-point to each
B-point.
+The k-d tree is parsed independently (on different CPU threads) for each row
of B.
+There is just one technical issue, however: when a B-point has no neighbor within the acceptable distance, the search is forced to parse the whole k-d tree to confirm that there is no match!
+Therefore if one catalog only covers a small portion (in the coordinate space)
of the other catalog, the k-d tree algorithm will be forced to parse the full
k-d tree for the majority of points!
+This will dramatically decrease the running speed of Match.
+Therefore, Match first divides the range of the first input in all its
dimensions into bins that have a width of the requested aperture (similar to a
histogram), and will only do the k-d tree based search when the point in
catalog B actually falls within a bin that has at least one element in A.
+@end table
-@node MakeProfiles, MakeNoise, Data modeling, Data modeling
-@section MakeProfiles
+Above, we described different ways of finding the @mymath{A_i} that is nearest
to each @mymath{B_j}.
+But this is not the whole matching process!
+Let's go ahead with a ``basic'' description of what happens next...
+Once @mymath{A_i} is matched with @mymath{B_j}, you may be tempted to remove it from the search for matches of subsequent rows @mymath{B_k} (where @mymath{k>j}).
+That way, as you go down B (and more matches are found), you have to calculate fewer distances (there are fewer elements in A that remain to be checked).
+However, this will introduce an important bias: @mymath{A_i} may actually be
closer to @mymath{B_k} than to @mymath{B_j}!
+But because @mymath{B_j} happened to be before @mymath{B_k} in your table,
@mymath{A_i} was removed from the potential search domain of @mymath{B_k}.
+The good match (@mymath{B_k} with @mymath{A_i}) will therefore be lost, and replaced by a false match between @mymath{B_j} and @mymath{A_i}!
-@cindex Checking detection algorithms
-@pindex @r{MakeProfiles (}astmkprof@r{)}
-MakeProfiles will create mock astronomical profiles from a catalog, either
individually or together in one output image.
-In data analysis, making a mock image can act like a calibration tool, through
which you can test how successfully your detection technique is able to detect
a known set of objects.
-There are commonly two aspects to detecting: the detection of the fainter
parts of bright objects (which in the case of galaxies fade into the noise very
slowly) or the complete detection of an over-all faint object.
-Making mock galaxies is the most accurate (and idealistic) way these two
aspects of a detection algorithm can be tested.
-You also need mock profiles when fitting known functional profiles to observations.
+In a single-dimensional match, this bias depends on the sorting of your two
datasets (leading to different matches if you shuffle your datasets).
+But it will get more complex as you add dimensionality, for example, in catalogs derived from 2D images or 3D cubes, where you have 2 or 3 different coordinates for each point.
-MakeProfiles was initially built for extra galactic studies, so currently the
only astronomical objects it can produce are stars and galaxies.
-We welcome the simulation of any other astronomical object.
-The general outline of the steps that MakeProfiles takes are the following:
+To address this problem, in Gnuastro (the Match program, or the matching functions of the library), similar to above, we first parse over the elements of B.
+But we will not associate the first nearest-neighbor with a match!
+Instead, we will use an array (with the number of rows in A, let's call it
``B-in-A'') to keep the list of all nearest element(s) in B that match each
A-point.
+Once all the points in B are parsed, each A-point in B-in-A will (possibly)
have a sorted list of B-points (there may be multiple B-points that fall within
the acceptable aperture of each A-point).
+In the previous example, element @mymath{i} of B-in-A (corresponding to @mymath{A_i}) will contain the following list of B-points: @mymath{B_j} and @mymath{B_k}.
-@enumerate
+A new array (with the number of points in B, let's call it A-in-B) is then
used to find the final match.
+We parse over B-in-A (that was completed above), and from it extract the
nearest B-point to each A-point (@mymath{B_k} for @mymath{A_i} in the example
above).
+If this is the first A-point that is found for this B-point, then we put this A-point into A-in-B (in the example above, element @mymath{k} is filled with @mymath{A_i}).
+If another A-point was previously found for this B-point, then the distances of the two A-points to that B-point are compared, and the A-point with the smaller distance is kept in A-in-B.
+This will give us the best match between the two catalogs, independent of any
sorting issues.
+Both the B-in-A and A-in-B will also keep the distances, so distances are only
measured once.
+@noindent
+In summary, here are the points to consider when selecting an algorithm, or
the order of your inputs (for optimal speed, the match will be the same):
+@itemize
@item
-Build the full profile out to its truncation radius in a possibly over-sampled
array.
-
+For larger datasets, the k-d tree based method (when run on all available threads) is much faster than the classical sort-based method.
@item
-Multiply all the elements by a fixed constant so its total magnitude equals
the desired total magnitude.
-
+The k-d tree is constructed for the first input table and the multi-threading
is done on the rows of the second input table.
+The construction of a larger dataset's k-d tree will take longer, but
multi-threading will work better when you have more rows.
+As a result, the optimal way to order your inputs is to give the smaller table (with fewer rows) as the first argument (so its k-d tree is constructed), and the larger table as the second argument (so its rows are checked in parallel); see the sketch after this list.
@item
-If @option{--individual} is called, save the array for each profile to a FITS
file.
+If you repeatedly need to match against the same (large!) catalog, the k-d tree construction itself can take a significant fraction of the running time.
+Therefore you can save its k-d tree into a file once, and simply give it to later calls, like the example given in the description of the k-d tree algorithm above.
+@end itemize
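+
+For example, in the minimal sketch below (hypothetical file names, where @file{small.fits} has fewer rows than @file{big.fits}), the k-d tree is constructed for the smaller table while the threads parse the rows of the larger one:
+
+@example
+$ astmatch small.fits --ccol1=RA,DEC big.fits --ccol2=RA,DEC \
+           --aperture=1/3600
+@end example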
-@item
-If @option{--nomerged} is not called, add the overlapping pixels of all the
created profiles to the output image and abort.
+@node Invoking astmatch, , Matching algorithms, Match
+@subsection Invoking Match
-@end enumerate
+When given two catalogs, Match finds the rows that are nearest to each other
within an input aperture.
+The executable name is @file{astmatch}, with the following general template:
-Using input values, MakeProfiles adds the World Coordinate System (WCS)
headers of the FITS standard to all its outputs (except PSF images!).
-For a simple test on a set of mock galaxies in one image, there is no need for
the third step or the WCS information.
+@example
+$ astmatch [OPTION ...] input-1 input-2
+@end example
-@cindex Transform image
-@cindex Lensing simulations
-@cindex Image transformations
-However in complicated simulations like weak lensing simulations, where each
galaxy undergoes various types of individual transformations based on their
position, those transformations can be applied to the different individual
images with other programs.
-After all the transformations are applied, using the WCS information in each
individual profile image, they can be merged into one output image for
convolution and adding noise.
+@noindent
+One line examples:
-@menu
-* Modeling basics:: Astronomical modeling basics.
-* If convolving afterwards:: Considerations for convolving later.
-* Profile magnitude:: Definition of total profile magnitude.
-* Invoking astmkprof:: Inputs and Options for MakeProfiles.
-@end menu
+@example
+## 1D wavelength match (within 5 angstroms) of the two inputs.
+## The wavelengths are in the 5th and 10th columns respectively.
+$ astmatch --aperture=5e-10 --ccol1=5 --ccol2=10 in1.fits in2.txt
+## Find the row that is closest to (RA,DEC) of (12.3456,6.7890)
+## with a maximum distance of 1 arcsecond (1/3600 degrees).
+## The coordinates can also be given in sexagesimal.
+$ astmatch input1.txt --ccol1=ra,dec --coord=12.3456,6.7890 \
+ --aperture=1/3600
+## Find matching rows of two catalogs with a circular aperture
+## of radius 2 (same unit as position columns: pixels in this case).
+$ astmatch input1.txt input2.fits --aperture=2 \
+ --ccol1=X,Y --ccol2=IMG_X,IMG_Y
-@node Modeling basics, If convolving afterwards, MakeProfiles, MakeProfiles
-@subsection Modeling basics
+## Similar to before, but the output is created by merging various
+## columns from the two inputs: columns 1, RA, DEC from the first
+## input, followed by all columns starting with `MAG' and the `BRG'
+## column from second input and the 10th column from first input.
+$ astmatch input1.txt input2.fits --aperture=1/3600 \
+ --ccol1=ra,dec --ccol2=RAJ2000,DEJ2000 \
+ --outcols=a1,aRA,aDEC,b/^MAG/,bBRG,a10
-In the subsections below, first a review of some very basic information and
concepts behind modeling a real astronomical image is given.
-You can skip this subsection if you are already sufficiently familiar with
these concepts.
+## Assuming both inputs have the same column metadata (same name
+## and numeric type), the output will contain all the rows of the
+## first input, appended with the non-matching rows of the second
+## input (good when you need to merge multiple catalogs that
+## may have matching items, which you do not want to repeat).
+$ astmatch input1.fits input2.fits --ccol1=RA,DEC --ccol2=RA,DEC \
+ --aperture=1/3600 --notmatched --outcols=_all
-@menu
-* Defining an ellipse and ellipsoid:: Definition of these important shapes.
-* PSF:: Radial profiles for the PSF.
-* Stars:: Making mock star profiles.
-* Galaxies:: Radial profiles for galaxies.
-* Sampling from a function:: Sample a function on a pixelated canvas.
-* Oversampling:: Oversampling the model.
-@end menu
+## Match the two catalogs within an elliptical aperture of 1 and 2
+## arc-seconds along RA and Dec respectively.
+$ astmatch --aperture=1/3600,2/3600 in1.fits in2.txt
-@node Defining an ellipse and ellipsoid, PSF, Modeling basics, Modeling basics
-@subsubsection Defining an ellipse and ellipsoid
+## Match the RA and DEC columns of the first input with the RA_D
+## and DEC_D columns of the second within a 0.5 arcsecond aperture.
+$ astmatch --ccol1=RA,DEC --ccol2=RA_D,DEC_D --aperture=0.5/3600 \
+ in1.fits in2.fits
-@cindex Ellipse
-@cindex Axis ratio
-@cindex Position angle
-The PSF, see @ref{PSF}, and galaxy radial profiles are generally defined on an
ellipse.
-Therefore, in this section we will start defining an ellipse on a pixelated 2D
surface.
-Labeling the major axis of an ellipse @mymath{a}, and its minor axis with
@mymath{b}, the @emph{axis ratio} is defined as: @mymath{q\equiv b/a}.
-The major axis of an ellipse can be aligned in any direction, therefore the
angle of the major axis with respect to the horizontal axis of the image is
defined to be the @emph{position angle} of the ellipse and in this book, we
show it with @mymath{\theta}.
+## Match in 3D (RA, Dec and Wavelength).
+$ astmatch --ccol1=2,3,4 --ccol2=2,3,4 -a0.5/3600,0.5/3600,5e-10 \
+ in1.fits in2.txt
+@end example
-@cindex Radial profile on ellipse
-Our aim is to put a radial profile of any functional form @mymath{f(r)} over
an ellipse.
-Hence we need to associate a radius/distance to every point in space.
-Let's define the radial distance @mymath{r_{el}} as the distance on the major
axis to the center of an ellipse which is located at @mymath{i_c} and
@mymath{j_c} (in other words @mymath{r_{el}\equiv{a}}).
-We want to find @mymath{r_{el}} of a point located at @mymath{(i,j)} (in the
image coordinate system) from the center of the ellipse with axis ratio
@mymath{q} and position angle @mymath{\theta}.
-First the coordinate system is rotated@footnote{Do not confuse the signs of
@mymath{sin} with the rotation matrix defined in @ref{Linear warping basics}.
-In that equation, the point is rotated, here the coordinates are rotated and
the point is fixed.} by @mymath{\theta} to get the new rotated coordinates of
that point @mymath{(i_r,j_r)}:
+Match will find the rows that are nearest to each other in two catalogs (given
some coordinate columns).
+Alternatively, it can construct the k-d tree of one catalog to save in a FITS
file for future matching of the same catalog with many others.
+To understand the inner working of Match and its algorithms, see @ref{Matching
algorithms}.
-@dispmath{i_r(i,j)=+(i_c-i)\cos\theta+(j_c-j)\sin\theta}
-@dispmath{j_r(i,j)=-(i_c-i)\sin\theta+(j_c-j)\cos\theta}
+When matching, two catalogs are necessary for input.
+But for constructing a k-d tree, only a single catalog should be given.
+The input tables can be plain text tables or FITS tables, for more see
@ref{Tables}.
+But other ways of feeding inputs are also supported:
+@itemize
+@item
+The @emph{first} catalog can also come from the standard input (for example, a pipe that feeds the output of a previous command to Match, see @ref{Standard input}; a minimal sketch is given after this list);
+@item
+When you only want to match one point with another catalog, you can use the
@option{--coord} option to avoid creating a file for the @emph{second} input
catalog.
+@end itemize
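+
+For example, the minimal sketch below (with hypothetical file names) feeds a plain-text table to Match through a pipe as the first catalog:
+
+@example
+## The standard input replaces the first input catalog.
+$ cat A.txt | astmatch --ccol1=1,2 B.fits --ccol2=RA,DEC \
+             --aperture=1/3600
+@end example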
-@cindex Elliptical distance
-@noindent Recall that an ellipse is defined by @mymath{(i_r/a)^2+(j_r/b)^2=1}
and that we defined @mymath{r_{el}\equiv{a}}.
-Hence, multiplying all elements of the ellipse definition with @mymath{r_{el}^2} we get the elliptical distance at this point: @mymath{r_{el}=\sqrt{i_r^2+(j_r/q)^2}}.
-To place the radial profiles explained below over an ellipse,
@mymath{f(r_{el})} is calculated based on the functional radial profile desired.
+Match follows the same basic behavior of all Gnuastro programs as fully
described in @ref{Common program behavior}.
+If the first input is a FITS file, the common @option{--hdu} option (see
@ref{Input output options}) should be used to identify the extension.
+When the second input is FITS, the extension must be specified with
@option{--hdu2}.
-@cindex Ellipsoid
-@cindex Euler angles
-An ellipse in 3D, or an @url{https://en.wikipedia.org/wiki/Ellipsoid,
ellipsoid}, can be defined following similar principles as before.
-Labeling the major (largest) axis length as @mymath{a}, the second and third
(in a right-handed coordinate system) axis lengths can be labeled as @mymath{b}
and @mymath{c}.
-Hence we have two axis ratios: @mymath{q_1\equiv{b/a}} and
@mymath{q_2\equiv{c/a}}.
-The orientation of the ellipsoid can be defined from the orientation of its
major axis.
-There are many ways to define 3D orientation and order matters.
-So to be clear, here we use the ZXZ (or @mymath{Z_1X_2Z_3}) proper
@url{https://en.wikipedia.org/wiki/Euler_angles, Euler angles} to define the 3D
orientation.
-In short, when a point is rotated in this order, we first rotate it around the
Z axis (third axis) by @mymath{\alpha}, then about the (rotated) X axis by
@mymath{\beta} and finally about the (rotated) Z axis by @mymath{\gamma}.
+When @option{--quiet} is not called, Match will print its various processing
phases (including the number of matches found) in standard output (on the
command-line).
+When matches are found, by default, two tables will be output (if in FITS
format, as two HDUs).
+Each output table will contain the re-arranged rows of the respective input
table.
+In other words, both tables will have the same number of rows, and row N in both corresponds to the Nth match between the two.
+If no matches are found, the columns of the output table(s) will have zero
rows (with proper meta-data).
+The output format can be changed with the following options:
+@itemize
+@item
+@option{--outcols}: The output will be a single table with rows chosen from
either of the two inputs in any order.
+@item
+@option{--notmatched}: The output tables will contain the rows that did not
match between the two tables.
+If called with @option{--outcols}, the output will be a single table with all
non-matched rows of both tables.
+@item
+@option{--logasoutput}: The output will be a single table with the contents of
the log file, see below.
+@end itemize
-Following the discussion in @ref{Merging multiple warpings}, we can define the
full rotation with the following matrix multiplication.
-However, here we are rotating the coordinates, not the point.
-Therefore, both the rotation angles and rotation order are reversed.
-We are also not using homogeneous coordinates (see @ref{Linear warping
basics}) since we are not concerned with translation in this context:
+If no output file name is given with the @option{--output} option, then automatic output (see @ref{Automatic output}) will be used to determine the output name(s).
+Depending on @option{--tableformat} (see @ref{Input output options}), the
output will be a (possibly multi-extension) FITS file or (possibly two) plain
text file(s).
+Generally, giving a filename to @option{--output} is recommended.
-@dispmath{\left[\matrix{i_r\cr j_r\cr k_r}\right] =
- \left[\matrix{cos\gamma&sin\gamma&0\cr -sin\gamma&cos\gamma&0\cr
0&0&1}\right]
- \left[\matrix{1&0&0\cr 0&cos\beta&sin\beta\cr
0&-sin\beta&cos\beta }\right]
- \left[\matrix{cos\alpha&sin\alpha&0\cr -sin\alpha&cos\alpha&0\cr
0&0&1}\right]
- \left[\matrix{i_c-i\cr j_c-j\cr k_c-k}\right] }
+When the @option{--log} option is called (see @ref{Operating mode options}),
and there was a match, Match will also create a file named @file{astmatch.fits}
(or @file{astmatch.txt}, depending on @option{--tableformat}, see @ref{Input
output options}) in the directory it is run in.
+This log table will have three columns.
+The first and second columns show the matching row/record number (counting
from 1) of the first and second input catalogs respectively.
+The third column is the distance between the two matched positions.
+The units of the distance are the same as the given coordinates (given the
possible ellipticity, see description of @option{--aperture} below).
+When @option{--logasoutput} is called, no log file (with a fixed name) will be
created.
+In this case, the output file (possibly given by the @option{--output} option)
will have the contents of this log file.
+@cartouche
@noindent
-Recall that an ellipsoid can be characterized with
-@mymath{(i_r/a)^2+(j_r/b)^2+(k_r/c)^2=1}, so similar to before
-(@mymath{r_{el}\equiv{a}}), we can find the ellipsoidal radius at pixel
-@mymath{(i,j,k)} as: @mymath{r_{el}=\sqrt{i_r^2+(j_r/q_1)^2+(k_r/q_2)^2}}.
-
-@cindex Breadth first search
-@cindex Inside-out construction
-@cindex Making profiles pixel by pixel
-@cindex Pixel by pixel making of profiles
-MakeProfiles builds the profile starting from the nearest element (pixel in an
image) in the dataset to the profile center.
-The profile value is calculated for that central pixel using Monte Carlo
integration, see @ref{Sampling from a function}.
-The next pixel is the next nearest neighbor to the central pixel as defined by
@mymath{r_{el}}.
-This process goes on until the profile is fully built up to the truncation radius.
-This is done fairly efficiently using a breadth first parsing
strategy@footnote{@url{http://en.wikipedia.org/wiki/Breadth-first_search}}
which is implemented through an ordered linked list.
-
-Using this approach, we build the profile by expanding the circumference.
-Not a single extra pixel has to be checked (the calculation of @mymath{r_{el}} from above is not cheap in CPU terms).
-Another consequence of this strategy is that extending MakeProfiles to three
dimensions becomes very simple: only the neighbors of each pixel have to be
changed.
-Everything else after that (when the pixel index and its radial profile have
entered the linked list) is the same, no matter the number of dimensions we are
dealing with.
-
+@strong{@option{--log} is not thread-safe}: As described above, when
@option{--logasoutput} is not called, the Log file has a fixed name for all
calls to Match.
+Therefore if a separate log is requested in two simultaneous calls to Match in
the same directory, Match will try to write to the same file.
+This will cause problems like a corrupted log file, undefined behavior, or a crash.
+Remember that @option{--log} is mainly intended for debugging purposes; if you want the log file with a specific name, simply use @option{--logasoutput} (which will also be faster, since no arranging of the input columns is necessary).
+@end cartouche
+@table @option
+@item -H STR
+@itemx --hdu2=STR
+The extension/HDU of the second input if it is a FITS file.
+When it is not a FITS file, this option's value is ignored.
+For the first input, the common option @option{--hdu} must be used.
+@item -k STR
+@itemx --kdtree=STR
+Select the algorithm and/or the way to construct or import the k-d tree.
+A summary of the four acceptable strings for this option are described here
for completeness.
+However, for a much more detailed discussion on Match's algorithms with
examples, see @ref{Matching algorithms}.
+@table @code
+@item internal
+Construct a k-d tree for the first input internally (within the same run of
Match), and parallelize over the rows of the second to find the nearest points.
+This is the default algorithm/method used by Match (when this option is not
called).
+@item build
+Only construct the k-d tree of the single given input and abort (no matching is done).
+The k-d tree will be saved in the file given to @option{--output}.
+@item CUSTOM-FITS-FILE
+Use the given FITS file as a k-d tree (that was previously constructed with
Match itself) of the first input, and do not construct any k-d tree internally.
+The FITS file should have two columns with an unsigned 32-bit integer data
type and a @code{KDTROOT} keyword that contains the index of the root of the
k-d tree.
+For more on Gnuastro's k-d tree format, see @ref{K-d tree}.
+@item disable
+Do not use the k-d tree algorithm for finding the nearest neighbor; instead, use the sort-based method.
+@end table
+@item --kdtreehdu=STR
+The HDU of the FITS file, when a FITS file is given to the @option{--kdtree}
option that was described above.
-@node PSF, Stars, Defining an ellipse and ellipsoid, Modeling basics
-@subsubsection Point spread function
+@item --outcols=STR[,STR,[...]]
+Columns (from both inputs) to write into a single matched table output.
+The value to @code{--outcols} must be a comma-separated list of column
identifiers (number or name, see @ref{Selecting table columns}).
+The expected format depends on @option{--notmatched} and explained below.
+By default (when @option{--notmatched} is not called), the number of rows in the output will be equal to the number of matches.
+However, when @option{--notmatched} is called, all the rows (from the
requested columns) of the first input are placed in the output, and the
not-matched rows of the second input are inserted afterwards (useful when you
want to merge unique entries of multiple catalogs into one).
-@cindex PSF
-@cindex Point source
-@cindex Diffraction limited
-@cindex Point spread function
-@cindex Spread of a point source
-Assume we have a `point' source, or a source that is far smaller than the
maximum resolution (a pixel).
-When we take an image of it, it will `spread' over an area.
-To quantify that spread, we can define a `function'.
-This is how the ``point spread function'' or the PSF of an image is defined.
+@table @asis
+@item Default (only matching rows)
+The first character of each string specifies the input catalog: @option{a} for
the first and @option{b} for the second.
+The rest of the characters of the string will be directly used to identify the
proper column(s) in the respective table.
+See @ref{Selecting table columns} for how columns can be specified in Gnuastro.
-This `spread' can have various causes, for example, in ground-based astronomy,
due to the atmosphere.
-In practice we can never surpass the `spread' due to the diffraction of the
telescope aperture (even in Space!).
-Various other effects can also be quantified through a PSF.
-For example, the simple fact that we are sampling in a discrete space, namely
the pixels, also produces a very small `spread' in the image.
+For example, the output of @option{--outcols=a1,bRA,bDEC} will have three
columns: the first column of the first input, along with the @option{RA} and
@option{DEC} columns of the second input.
-@cindex Blur image
-@cindex Convolution
-@cindex Image blurring
-@cindex PSF image size
-Convolution is the mathematical process by which we can apply a `spread' to an
image, or in other words blur the image, see @ref{Convolution process}.
-The sum of pixels of an image should remain unchanged after convolution.
-Therefore, it is important that the sum of all the pixels of the PSF be unity.
-The PSF image also has to have an odd number of pixels on its sides so one
pixel can be defined as the center.
+If the string after @option{a} or @option{b} is @option{_all}, then all the
columns of the respective input file will be written in the output.
+For example, the command below will print all the input columns from the first
catalog along with the 5th column from the second:
-In MakeProfiles, the PSF can be set by the two methods explained below:
+@example
+$ astmatch a.fits b.fits --outcols=a_all,b5
+@end example
-@table @asis
+@code{_all} can be used multiple times, possibly on both inputs.
+Tip: if an input's column is called @code{_all} (an unlikely name!) and you do
not want all the columns from that table the output, use its column number to
avoid confusion.
-@item Parametric functions
-@cindex FWHM
-@cindex PSF width
-@cindex Parametric PSFs
-@cindex Full Width at Half Maximum
-A known mathematical function is used to make the PSF.
-In this case, only the parameters to define the functions are necessary and
MakeProfiles will make a PSF based on the given parameters for each function.
-In both cases, the center of the profile has to be exactly in the middle of
the central pixel of the PSF (which is automatically done by MakeProfiles).
-When talking about the PSF, usually, the full width at half maximum or FWHM is
used as a scale of the width of the PSF.
+Another example is given in the one-line examples above.
+Compared to the default case (where two tables with all their columns are
saved separately), using this option is much faster: it will only read and
re-arrange the necessary columns and it will write a single output table.
+Combined with regular expressions in large tables, this can be a very powerful
and convenient way to merge various tables into one.
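+For example, assuming the second input's magnitude column names all start
with @code{MAG} (a hypothetical naming convention), something like the command
below would keep all the columns of the first input, but only the magnitude
columns of the second:
+
+@example
+$ astmatch a.fits b.fits --ccol1=RA,DEC --ccol2=RA,DEC \
+           --aperture=1/3600 --outcols=a_all,b/^MAG/
+@end example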
-@table @cite
-@item Gaussian
-@cindex Gaussian distribution
-In the older papers, and to a lesser extent even today, some researchers use
the 2D Gaussian function to approximate the PSF of ground based images.
-In its most general form, a Gaussian function can be written as:
+When @option{--coord} is given, no second catalog will be read.
+The second catalog will be created internally based on the values given to
@option{--coord}.
+So column names are not defined and you can only request integer column
numbers that are less than the number of coordinates given to @option{--coord}.
+For example, if you want to find the row matching RA of 1.2345 and Dec of
6.7890, then you should use @option{--coord=1.2345,6.7890}.
+But when using @option{--outcols}, you cannot give column names like
@code{bRA}, or numbers beyond the given coordinates like @code{b25}.
-@dispmath{f(r)=a \exp \left( -(x-\mu)^2 \over 2\sigma^2 \right)+d}
+@item With @option{--notmatched}
+Only the column names/numbers should be given (for example,
@option{--outcols=RA,DEC,MAGNITUDE}).
+It is assumed that both input tables have the requested column(s) and that the
numerical data type of each requested column (with the same name) is the same
in both inputs.
+Therefore if one input has a @code{MAGNITUDE} column with a 32-bit floating
point type, but the @code{MAGNITUDE} column of the other is 64-bit floating
point, Match will crash with an error.
+The metadata of the columns will come from the first input.
-Since the center of the profile is pre-defined, @mymath{\mu} and @mymath{d}
are constrained.
-@mymath{a} can also be found because the function has to be normalized.
-So the only important parameter for MakeProfiles is the @mymath{\sigma}.
-In the Gaussian function we have this relation between the FWHM and
@mymath{\sigma}:
+As an example, let's assume @file{input1.txt} and @file{input2.fits} each have
a different number of columns and rows.
+However, they both have the @code{RA} (64-bit floating point), @code{DEC}
(64-bit floating point) and @code{MAGNITUDE} (32-bit floating point) columns.
+If @file{input1.txt} has 100 rows and @file{input2.fits} has 300 rows (such
that 50 of them match within 1 arcsec of the first), then the output of the
command below will have @mymath{100+(300-50)=350} rows and only three columns.
+Other columns in each catalog, which may be different, are ignored.
-@cindex Gaussian FWHM
-@dispmath{\rm{FWHM}_g=2\sqrt{2\ln{2}}\sigma \approx 2.35482\sigma}
+@example
+$ astmatch input1.txt --ccol1=RA,DEC \
+ input2.fits --ccol2=RA,DEC \
+ --aperture=1/3600 \
+ --notmatched --outcols=RA,DEC,MAGNITUDE
+@end example
+@end table
-@item Moffat
-@cindex Moffat function
-The Gaussian profile is much sharper than the images taken from stars on
photographic plates or CCDs.
-Therefore in 1969, Moffat proposed this functional form for the image of stars:
+@item -l
+@itemx --logasoutput
+The output file will have the contents of the log file: the indices in the
two catalogs that match with each other, along with their distance; see the
description of the log file above.
-@dispmath{f(r)=a \left[ 1+\left( r\over \alpha \right)^2 \right]^{-\beta}}
+When this option is called, a separate log file will not be created and the
output will not contain any of the input columns (either as two tables
containing the re-arranged columns of each input, or a single table mixing
columns), only their indices in the log format.
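+For example, the sketch below would only write the matching row indices (and
distances) of the two catalogs into @file{ids.fits}:
+
+@example
+$ astmatch a.fits b.fits --ccol1=RA,DEC --ccol2=RA,DEC \
+           --aperture=1/3600 --logasoutput --output=ids.fits
+@end example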
-@cindex Moffat beta
-Again, @mymath{a} is constrained by the normalization, therefore two
parameters define the shape of the Moffat function: @mymath{\alpha} and
@mymath{\beta}.
-The radial parameter is @mymath{\alpha} which is related to the FWHM by
+@item --notmatched
+Write the non-matching rows into the outputs, not the matched ones.
+By default, this will produce two output tables that will not necessarily
have the same number of rows.
+However, when called with @option{--outcols}, it is possible to import
non-matching rows of the second into the first.
+See the description of @option{--outcols} for more.
-@cindex Moffat FWHM
-@dispmath{\rm{FWHM}_m=2\alpha\sqrt{2^{1/\beta}-1}}
+@item -c INT/STR[,INT/STR]
+@itemx --ccol1=INT/STR[,INT/STR]
+The coordinate columns of the first input.
+The number of dimensions for the match is determined by the number of
comma-separated values given to this option.
+The values can be the column number (counting from 1), exact column name or a
regular expression.
+For more, see @ref{Selecting table columns}.
+See the one-line examples above for some usages of this option.
-@cindex Compare Moffat and Gaussian
-@cindex PSF, Moffat compared Gaussian
-@noindent
-Comparing with the PSF predicted from atmospheric turbulence theory with a
Moffat function, Trujillo et al.@footnote{
-Trujillo, I., J. A. L. Aguerri, J. Cepa, and C. M. Gutierrez (2001). ``The
effects of seeing on S@'ersic profiles - II. The Moffat PSF''. In: MNRAS 328,
pp. 977---985.}
-claim that @mymath{\beta} should be 4.765.
-They also show how the Moffat PSF contains the Gaussian PSF as a limiting case
when @mymath{\beta\to\infty}.
+@item -C INT/STR[,INT/STR]
+@itemx --ccol2=INT/STR[,INT/STR]
+The coordinate columns of the second input.
+See the example in @option{--ccol1} for more.
-@end table
+@item -d FLT[,FLT]
+@itemx --coord=FLT[,FLT]
+Manually specify the coordinates to match against the given catalog.
+With this option, Match will not look for a second input file/table and will
directly use the coordinates given to this option.
+When the coordinates are RA and Dec, the comma-separated values can either be
in degrees (a single number), or sexagesimal (@code{_h_m_} for RA, @code{_d_m_}
for Dec, or @code{_:_:_} for both).
-@item An input FITS image
-An input image file can also be specified to be used as a PSF.
-If the sum of its pixels are not equal to 1, the pixels will be multiplied by
a fraction so the sum does become 1.
+When this option is called, the output changes in the following ways:
+1) When @option{--outcols} is specified, for the second input it can only
accept integer numbers that are less than the number of values given to this
option; see the description of that option for more.
+2) By default (when @option{--outcols} is not used), only the matching row of
the first table will be output (a single file), not two separate files (one for
each table).
-Gnuastro has tools to extract the non-parametric (extended) PSF of any image
as a FITS file (assuming there are a sufficient number of stars in it), see
@ref{Building the extended PSF}.
-This method is not perfect (will have noise if you do not have many stars),
but it is the actual PSF of the data that is not forced into any parametric
form.
-@end table
+This option is good when you have a (large) catalog and only want to match a
single coordinate to it (for example, to find the nearest catalog entry to your
desired point).
+With this option, you can write the coordinates on the command-line and thus
avoid the need to make a single-row file.
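+For example, the sketch below finds the nearest entry of a hypothetical
@file{catalog.fits} to the given point, within 5 arc-seconds (assuming the
coordinate columns are in degrees):
+
+@example
+$ astmatch catalog.fits --ccol1=RA,DEC \
+           --coord=1.2345,6.7890 --aperture=5/3600
+@end example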
-While the Gaussian is only dependent on the FWHM, the Moffat function is also
dependent on @mymath{\beta}.
-Comparing these two functions with a fixed FWHM gives the following results:
+@item -a FLT[,FLT[,FLT]]
+@itemx --aperture=FLT[,FLT[,FLT]]
+Parameters of the aperture for matching.
+The values given to this option can be fractions, for example, when the
position columns are in units of degrees, @option{1/3600} can be used to ask
for one arc-second.
+The interpretation of the values depends on the requested dimensions
(determined from @option{--ccol1} and @option{--ccol2}) and how many values are
given to this option.
-@itemize
-@item
-Within the FWHM, the functions do not have significant differences.
-@item
-For a fixed FWHM, as @mymath{\beta} increases, the Moffat function becomes
sharper.
-@item
-The Gaussian function is much sharper than the Moffat functions, even when
@mymath{\beta} is large.
-@end itemize
+When multiple objects are found within the aperture, the match is defined
+as the nearest one. In a multi-dimensional dataset, when the aperture is a
+general ellipse or ellipsoid (and not a circle or sphere), the distance is
+calculated in the elliptical space along the major axis. For the definition
+of this distance, see @mymath{r_{el}} in @ref{Defining an ellipse and
+ellipsoid}.
+@table @asis
+@item 1D match
+The aperture/interval can only take one value: half of the interval around
each point (maximum distance from each point).
+@item 2D match
+In a 2D match, the aperture can be a circle, an ellipse aligned with the axes,
or an ellipse with a rotated major axis.
+To simplify the usage, you can determine the shape based on the number of free
parameters for each.
+@table @asis
+@item 1 number
+for example, @option{--aperture=2}.
+The aperture will be a circle of the given radius.
+The value will be in the same units as the columns in @option{--ccol1} and
@option{--ccol2}.
-@node Stars, Galaxies, PSF, Modeling basics
-@subsubsection Stars
+@item 2 numbers
+for example, @option{--aperture=3,4e-10}.
+The aperture will be an ellipse (if the two numbers are different) with the
respective value along each dimension.
+The numbers are in units of the first and second axis.
+In the example above, the semi-axis value along the first axis will be 3 (in
units of the first coordinate) and along the second axis will be
@mymath{4\times10^{-10}} (in units of the second coordinate).
+Such values can happen if you are comparing catalogs of spectra, for example.
+If more than one object exists in the aperture, the nearest will be found
along the major axis as described in @ref{Defining an ellipse and ellipsoid}.
-@cindex Modeling stars
-@cindex Stars, modeling
-In MakeProfiles, stars are generally considered to be a point source.
-This is usually the case for extra galactic studies, where nearby stars are
also in the field.
-Since a star is only a point source, we assume that it only fills one pixel
prior to convolution.
-In fact, exactly for this reason, in astronomical images the light profiles of
stars are one of the best methods to understand the shape of the PSF and a very
large fraction of scientific research is preformed by assuming the shapes of
stars to be the PSF of the image.
+@item 3 numbers
+for example, @option{--aperture=2,0.6,30}.
+The aperture will be an ellipse (if the second value is not 1).
+The first number is the semi-major axis, the second is the axis ratio and the
third is the position angle (in degrees).
+If multiple matches are found within the ellipse, the distance (to find the
nearest) is calculated along the major axis in the elliptical space, see
@ref{Defining an ellipse and ellipsoid}.
+@end table
+@item 3D match
+The aperture (matching volume) can be a sphere, an ellipsoid aligned on the
three axes, or a general ellipsoid rotated in any direction.
+To simplify the usage, the shape can be determined based on the number of
values given to this option.
+@table @asis
+@item 1 number
+for example, @option{--aperture=3}.
+The matching volume will be a sphere of the given radius.
+The value is in the same units as the input coordinates.
+@item 3 numbers
+for example, @option{--aperture=4,5,6e-10}.
+The aperture will be a general ellipsoid with the respective extent along each
dimension.
+The numbers must be in the same units as each axis.
+This is very similar to the two number case of 2D inputs.
+See there for more.
+@item 6 numbers
+for example, @option{--aperture=4,0.5,0.6,10,20,30}.
+The numbers represent the full general ellipsoid definition (in any
orientation).
+For the definition of a general ellipsoid, see @ref{Defining an ellipse and
ellipsoid}.
+The first number is the semi-major axis.
+The second and third are the two axis ratios.
+The last three are the three Euler angles in units of degrees in the ZXZ order
as fully described in @ref{Defining an ellipse and ellipsoid}.
+@end table
-@node Galaxies, Sampling from a function, Stars, Modeling basics
-@subsubsection Galaxies
+@end table
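+As a rough sketch of the 2D forms described above (assuming coordinate
columns in degrees):
+
+@example
+## Circular aperture with a radius of 1 arcsec:
+$ astmatch a.fits b.fits --ccol1=RA,DEC --ccol2=RA,DEC \
+           --aperture=1/3600
+
+## Axis-aligned ellipse: 1 arcsec along RA, 2 arcsec along Dec:
+$ astmatch a.fits b.fits --ccol1=RA,DEC --ccol2=RA,DEC \
+           --aperture=1/3600,2/3600
+
+## Rotated ellipse: semi-major axis 2 arcsec, axis ratio 0.5,
+## position angle 45 degrees:
+$ astmatch a.fits b.fits --ccol1=RA,DEC --ccol2=RA,DEC \
+           --aperture=2/3600,0.5,45
+@end example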
+@end table
-@cindex Galaxy profiles
-@cindex S@'ersic profile
-@cindex Profiles, galaxies
-@cindex Generalized de Vaucouleur profile
-Today, most practitioners agree that the flux of galaxies can be modeled with
one or a few generalized de Vaucouleur's (or S@'ersic) profiles.
-@dispmath{I(r) = I_e \exp \left ( -b_n \left[ \left( r \over r_e \right)^{1/n}
-1 \right] \right )}
-@cindex Brightness
-@cindex S@'ersic, J. L.
-@cindex S@'ersic index
-@cindex Effective radius
-@cindex Radius, effective
-@cindex de Vaucouleur profile
-@cindex G@'erard de Vaucouleurs
-G@'erard de Vaucouleurs (1918-1995) was first to show in 1948 that this
function resembles the galaxy light profiles, with the only difference that he
held @mymath{n} fixed to a value of 4.
-Twenty years later in 1968, J. L. S@'ersic showed that @mymath{n} can have a
variety of values and does not necessarily need to be 4.
-This profile depends on the effective radius (@mymath{r_e}) which is defined
as the radius which contains half of the profile's 2-dimensional integral to
infinity (see @ref{Profile magnitude}).
-@mymath{I_e} is the flux at the effective radius.
-The S@'ersic index @mymath{n} is used to define the concentration of the
profile within @mymath{r_e} and @mymath{b_n} is a constant dependent on
@mymath{n}.
-MacArthur et al.@footnote{MacArthur, L. A., S. Courteau, and J. A. Holtzman
(2003). ``Structure of Disk-dominated Galaxies. I. Bulge/Disk Parameters,
Simulations, and Secular Evolution''. In: ApJ 582, pp. 689---722.} show that
for @mymath{n>0.35}, @mymath{b_n} can be accurately approximated using this
equation:
-@dispmath{b_n=2n - {1\over 3} + {4\over 405n} + {46\over 25515n^2} + {131\over
1148175n^3}-{2194697\over 30690717750n^4}}
-@node Sampling from a function, Oversampling, Galaxies, Modeling basics
-@subsubsection Sampling from a function
-@cindex Sampling
-A pixel is the ultimate level of accuracy to gather data, we cannot get any
more accurate in one image, this is known as sampling in signal processing.
-However, the mathematical profiles which describe our models have infinite
accuracy.
-Over a large fraction of the area of astrophysically interesting profiles (for
example, galaxies or PSFs), the variation of the profile over the area of one
pixel is not too significant.
-In such cases, the elliptical radius (@mymath{r_{el}}) of the center of the
pixel can be assigned as the final value of the pixel, (see @ref{Defining an
ellipse and ellipsoid}).
-@cindex Integration over pixel
-@cindex Gradient over pixel area
-@cindex Function gradient over pixel area
-As you approach their center, some galaxies become very sharp (their value
significantly changes over one pixel's area).
-This sharpness increases with smaller effective radius and larger S@'ersic
values.
-Thus rendering the central value extremely inaccurate.
-The first method that comes to mind for solving this problem is integration.
-The functional form of the profile can be integrated over the pixel area in a
2D integration process.
-However, unfortunately numerical integration techniques also have their
limitations and when such sharp profiles are needed they can become extremely
inaccurate.
-@cindex Monte carlo integration
-The most accurate method of sampling a continuous profile on a discrete space
is by choosing a large number of random points within the boundaries of the
pixel and taking their average value (or Monte Carlo integration).
-This is also, generally speaking, what happens in practice with the photons on
the pixel.
-The number of random points can be set with @option{--numrandom}.
-Unfortunately, repeating this Monte Carlo process would be extremely time and
CPU consuming if it is to be applied to every pixel.
-In order to not loose too much accuracy, in MakeProfiles, the profile is built
using both methods explained below.
-The building of the profile begins from its central pixel and continues
(radially) outwards.
-Monte Carlo integration is first applied (which yields @mymath{F_r}), then the
central pixel value (@mymath{F_c}) is calculated on the same pixel.
-If the fractional difference (@mymath{|F_r-F_c|/F_r}) is lower than a given
tolerance level (specified with @option{--tolerance}) MakeProfiles will stop
using Monte Carlo integration and only use the central pixel value.
-@cindex Inside-out construction
-The ordering of the pixels in this inside-out construction is based on
@mymath{r=\sqrt{(i_c-i)^2+(j_c-j)^2}}, not @mymath{r_{el}}, see @ref{Defining
an ellipse and ellipsoid}.
-When the axis ratios are large (near one) this is fine.
-But when they are small and the object is highly elliptical, it might seem
more reasonable to follow @mymath{r_{el}} not @mymath{r}.
-The problem is that the gradient is stronger in pixels with smaller @mymath{r}
(and larger @mymath{r_{el}}) than those with smaller @mymath{r_{el}}.
-In other words, the gradient is strongest along the minor axis.
-So if the next pixel is chosen based on @mymath{r_{el}}, the tolerance level
will be reached sooner and lots of pixels with large fractional differences
will be missed.
-Monte Carlo integration uses a random number of points.
-Thus, every time you run it, by default, you will get a different distribution
of points to sample within the pixel.
-In the case of large profiles, this will result in a slight difference of the
pixels which use Monte Carlo integration each time MakeProfiles is run.
-To have a deterministic result, you have to fix the random number generator
properties which is used to build the random distribution.
-This can be done by setting the @code{GSL_RNG_TYPE} and @code{GSL_RNG_SEED}
environment variables and calling MakeProfiles with the @option{--envseed}
option.
-To learn more about the process of generating random numbers, see
@ref{Generating random numbers}.
-@cindex Seed, Random number generator
-@cindex Random number generator, Seed
-The seed values are fixed for every profile: with @option{--envseed}, all the
profiles have the same seed and without it, each will get a different seed
using the system clock (which is accurate to within one microsecond).
-The same seed will be used to generate a random number for all the sub-pixel
positions of all the profiles.
-So in the former, the sub-pixel points checked for all the pixels undergoing
Monte carlo integration in all profiles will be identical.
-In other words, the sub-pixel points in the first (closest to the center)
pixel of all the profiles will be identical with each other.
-All the second pixels studied for all the profiles will also receive an
identical (different from the first pixel) set of sub-pixel points and so on.
-As long as the number of random points used is large enough or the profiles
are not identical, this should not cause any systematic bias.
-@node Oversampling, , Sampling from a function, Modeling basics
-@subsubsection Oversampling
-@cindex Oversampling
-The steps explained in @ref{Sampling from a function} do give an accurate
representation of a profile prior to convolution.
-However, in an actual observation, the image is first convolved with or
blurred by the atmospheric and instrument PSF in a continuous space and then it
is sampled on the discrete pixels of the camera.
-@cindex PSF over-sample
-In order to more accurately simulate this process, the unconvolved image and
the PSF are created on a finer pixel grid.
-In other words, the output image is a certain odd-integer multiple of the
desired size, we can call this `oversampling'.
-The user can specify this multiple as a command-line option.
-The reason this has to be an odd number is that the PSF has to be centered on
the center of its image.
-An image with an even number of pixels on each side does not have a central
pixel.
+@node Data modeling, High-level calculations, Data analysis, Top
+@chapter Data modeling
-The image can then be convolved with the PSF (which should also be oversampled
on the same scale).
-Finally, image can be sub-sampled to get to the initial desired pixel size of
the output image.
-After this, mock noise can be added as explained in the next section.
-This is because unlike the PSF, the noise occurs in each output pixel, not on
a continuous space like all the prior steps.
+@cindex Fitting
+@cindex Modeling
+In order to fully understand observations after the initial analysis of the
image, it is very important to compare them with existing models, to be able
to further understand both the models and the data.
+The tools in this chapter create model galaxies and will provide 2D fittings
to help understand the detections.
+@menu
+* MakeProfiles:: Making mock galaxies and stars.
+* MakeNoise:: Make (add) noise to an image.
+@end menu
-@node If convolving afterwards, Profile magnitude, Modeling basics,
MakeProfiles
-@subsection If convolving afterwards
-In case you want to convolve the image later with a given point spread
function, make sure to use a larger image size.
-After convolution, the profiles become larger and a profile that is normally
completely outside of the image might fall within it.
+@node MakeProfiles, MakeNoise, Data modeling, Data modeling
+@section MakeProfiles
-On one axis, if you want your final (convolved) image to be @mymath{m} pixels
and your PSF is @mymath{2n+1} pixels wide, then when calling MakeProfiles, set
the axis size to @mymath{m+2n}, not @mymath{m}.
-You also have to shift all the pixel positions of the profile centers on the
that axis by @mymath{n} pixels to the positive.
+@cindex Checking detection algorithms
+@pindex @r{MakeProfiles (}astmkprof@r{)}
+MakeProfiles will create mock astronomical profiles from a catalog, either
individually or together in one output image.
+In data analysis, making a mock image can act like a calibration tool, through
which you can test how successfully your detection technique is able to detect
a known set of objects.
+There are commonly two aspects to detection: the detection of the fainter
parts of bright objects (which in the case of galaxies fade into the noise very
slowly) and the complete detection of an overall faint object.
+Making mock galaxies is the most accurate (and idealistic) way these two
aspects of a detection algorithm can be tested.
+You also need mock profiles when fitting known functional profiles to
observations.
-After convolution, you can crop the outer @mymath{n} pixels with the section
crop box specification of Crop: @option{--section=n+1:*-n,n+1:*-n} (according
to the FITS standard, counting is from 1 so we use @code{n+1}) assuming your
PSF is a square, see @ref{Crop section syntax}.
-This will also remove all discrete Fourier transform artifacts (blurred sides)
from the final image.
-To facilitate this shift, MakeProfiles has the options @option{--xshift},
@option{--yshift} and @option{--prepforconv}, see @ref{Invoking astmkprof}.
+MakeProfiles was initially built for extra galactic studies, so currently the
only astronomical objects it can produce are stars and galaxies.
+We welcome the simulation of any other astronomical object.
+The general outline of the steps that MakeProfiles takes are the following:
+@enumerate
+@item
+Build the full profile out to its truncation radius in a possibly over-sampled
array.
+@item
+Multiply all the elements by a fixed constant so its total magnitude equals
the desired total magnitude.
+@item
+If @option{--individual} is called, save the array for each profile to a FITS
file.
+@item
+If @option{--nomerged} is not called, add the overlapping pixels of all the
created profiles to the output image and abort.
+@end enumerate
+Using input values, MakeProfiles adds the World Coordinate System (WCS)
headers of the FITS standard to all its outputs (except PSF images!).
+For a simple test on a set of mock galaxies in one image, there is no need for
the third step or the WCS information.
+@cindex Transform image
+@cindex Lensing simulations
+@cindex Image transformations
+However, in complicated simulations like weak lensing simulations, where each
galaxy undergoes various types of individual transformations based on its
position, those transformations can be applied to the different individual
images with other programs.
+After all the transformations are applied, using the WCS information in each
individual profile image, they can be merged into one output image for
convolution and adding noise.
+@menu
+* Modeling basics:: Astronomical modeling basics.
+* If convolving afterwards:: Considerations for convolving later.
+* Profile magnitude:: Definition of total profile magnitude.
+* Invoking astmkprof:: Inputs and Options for MakeProfiles.
+@end menu
-@node Profile magnitude, Invoking astmkprof, If convolving afterwards,
MakeProfiles
-@subsection Profile magnitude
-@cindex Truncation radius
-@cindex Sum for total flux
-To find the profile's total magnitude, (see @ref{Brightness flux magnitude}),
it is customary to use the 2D integration of the flux to infinity.
-However, in MakeProfiles we do not follow this idealistic approach and apply a
more realistic method to find the total magnitude: the sum of all the pixels
belonging to a profile within its predefined truncation radius.
-Note that if the truncation radius is not large enough, this can be
significantly different from the total integrated light to infinity.
-@cindex Integration to infinity
-An integration to infinity is not a realistic condition because no galaxy
extends indefinitely (important for high S@'ersic index profiles), pixelation
can also cause a significant difference between the actual total pixel sum
value of the profile and that of integration to infinity, especially in small
and high S@'ersic index profiles.
-To be safe, you can specify a large enough truncation radius for such compact
high S@'ersic index profiles.
+@node Modeling basics, If convolving afterwards, MakeProfiles, MakeProfiles
+@subsection Modeling basics
-If oversampling is used then the pixel value is calculated using the
over-sampled image, see @ref{Oversampling} which is much more accurate.
-The profile is first built in an array completely bounding it with a
normalization constant of unity (see @ref{Galaxies}).
-Taking @mymath{V} to be the desired pixel value and @mymath{S} to be the sum
of the pixels in the created profile, every pixel is then multiplied by
@mymath{V/S} so the sum is exactly @mymath{V}.
+In the subsections below, we first review some very basic information and
concepts behind modeling a real astronomical image.
+You can skip this subsection if you are already sufficiently familiar with
these concepts.
-If the @option{--individual} option is called, this same array is written to a
FITS file.
-If not, only the overlapping pixels of this array and the output image are
kept and added to the output array.
+@menu
+* Defining an ellipse and ellipsoid:: Definition of these important shapes.
+* PSF:: Radial profiles for the PSF.
+* Stars:: Making mock star profiles.
+* Galaxies:: Radial profiles for galaxies.
+* Sampling from a function:: Sample a function on a pixelated canvas.
+* Oversampling:: Oversampling the model.
+@end menu
+@node Defining an ellipse and ellipsoid, PSF, Modeling basics, Modeling basics
+@subsubsection Defining an ellipse and ellipsoid
+@cindex Ellipse
+@cindex Axis ratio
+@cindex Position angle
+The PSF, see @ref{PSF}, and galaxy radial profiles are generally defined on an
ellipse.
+Therefore, in this section we will start by defining an ellipse on a pixelated
2D surface.
+Labeling the major axis of an ellipse @mymath{a}, and its minor axis with
@mymath{b}, the @emph{axis ratio} is defined as: @mymath{q\equiv b/a}.
+The major axis of an ellipse can be aligned in any direction, therefore the
angle of the major axis with respect to the horizontal axis of the image is
defined to be the @emph{position angle} of the ellipse and in this book, we
show it with @mymath{\theta}.
+@cindex Radial profile on ellipse
+Our aim is to put a radial profile of any functional form @mymath{f(r)} over
an ellipse.
+Hence we need to associate a radius/distance to every point in space.
+Let's define the radial distance @mymath{r_{el}} as the distance on the major
axis to the center of an ellipse which is located at @mymath{i_c} and
@mymath{j_c} (in other words @mymath{r_{el}\equiv{a}}).
+We want to find @mymath{r_{el}} of a point located at @mymath{(i,j)} (in the
image coordinate system) from the center of the ellipse with axis ratio
@mymath{q} and position angle @mymath{\theta}.
+First the coordinate system is rotated@footnote{Do not confuse the signs of
@mymath{sin} with the rotation matrix defined in @ref{Linear warping basics}.
+In that equation, the point is rotated, here the coordinates are rotated and
the point is fixed.} by @mymath{\theta} to get the new rotated coordinates of
that point @mymath{(i_r,j_r)}:
+@dispmath{i_r(i,j)=+(i_c-i)\cos\theta+(j_c-j)\sin\theta}
+@dispmath{j_r(i,j)=-(i_c-i)\sin\theta+(j_c-j)\cos\theta}
+@cindex Elliptical distance
+@noindent Recall that an ellipse is defined by @mymath{(i_r/a)^2+(j_r/b)^2=1}
and that we defined @mymath{r_{el}\equiv{a}}.
+Hence, multiplying all elements of the ellipse definition with
@mymath{r_{el}^2} we get the elliptical distance of this point:
@mymath{r_{el}=\sqrt{i_r^2+(j_r/q)^2}}.
+To place the radial profiles explained below over an ellipse,
@mymath{f(r_{el})} is calculated based on the functional radial profile desired.
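+For example, a point on the minor axis at a distance @mymath{d} from the
center has @mymath{i_r=0} and @mymath{j_r=d}, so @mymath{r_{el}=d/q}: since
@mymath{q\leq1}, points along the minor axis are assigned an elliptical radius
that is larger than their Cartesian distance.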
-@node Invoking astmkprof, , Profile magnitude, MakeProfiles
-@subsection Invoking MakeProfiles
+@cindex Ellipsoid
+@cindex Euler angles
+An ellipse in 3D, or an @url{https://en.wikipedia.org/wiki/Ellipsoid,
ellipsoid}, can be defined following similar principles as before.
+Labeling the major (largest) axis length as @mymath{a}, the second and third
(in a right-handed coordinate system) axis lengths can be labeled as @mymath{b}
and @mymath{c}.
+Hence we have two axis ratios: @mymath{q_1\equiv{b/a}} and
@mymath{q_2\equiv{c/a}}.
+The orientation of the ellipsoid can be defined from the orientation of its
major axis.
+There are many ways to define 3D orientation and order matters.
+So to be clear, here we use the ZXZ (or @mymath{Z_1X_2Z_3}) proper
@url{https://en.wikipedia.org/wiki/Euler_angles, Euler angles} to define the 3D
orientation.
+In short, when a point is rotated in this order, we first rotate it around the
Z axis (third axis) by @mymath{\alpha}, then about the (rotated) X axis by
@mymath{\beta} and finally about the (rotated) Z axis by @mymath{\gamma}.
-MakeProfiles will make any number of profiles specified in a catalog either
individually or in one image.
-The executable name is @file{astmkprof} with the following general template
+Following the discussion in @ref{Merging multiple warpings}, we can define the
full rotation with the following matrix multiplication.
+However, here we are rotating the coordinates, not the point.
+Therefore, both the rotation angles and rotation order are reversed.
+We are also not using homogeneous coordinates (see @ref{Linear warping
basics}) since we are not concerned with translation in this context:
-@example
-$ astmkprof [OPTION ...] [Catalog]
-@end example
+@dispmath{\left[\matrix{i_r\cr j_r\cr k_r}\right] =
+   \left[\matrix{cos\gamma&sin\gamma&0\cr -sin\gamma&cos\gamma&0\cr 0&0&1}\right]
+   \left[\matrix{1&0&0\cr 0&cos\beta&sin\beta\cr 0&-sin\beta&cos\beta}\right]
+   \left[\matrix{cos\alpha&sin\alpha&0\cr -sin\alpha&cos\alpha&0\cr 0&0&1}\right]
+   \left[\matrix{i_c-i\cr j_c-j\cr k_c-k}\right]}
@noindent
-One line examples:
+Recall that an ellipsoid can be characterized with
+@mymath{(i_r/a)^2+(j_r/b)^2+(k_r/c)^2=1}, so similar to before
+(@mymath{r_{el}\equiv{a}}), we can find the ellipsoidal radius at pixel
+@mymath{(i,j,k)} as: @mymath{r_{el}=\sqrt{i_r^2+(j_r/q_1)^2+(k_r/q_2)^2}}.
-@example
-## Make an image with profiles in catalog.txt (with default size):
-$ astmkprof catalog.txt
+@cindex Breadth first search
+@cindex Inside-out construction
+@cindex Making profiles pixel by pixel
+@cindex Pixel by pixel making of profiles
+MakeProfiles builds the profile starting from the nearest element (pixel in an
image) in the dataset to the profile center.
+The profile value is calculated for that central pixel using Monte Carlo
integration, see @ref{Sampling from a function}.
+The next pixel is the next nearest neighbor to the central pixel as defined by
@mymath{r_{el}}.
+This process goes on until the profile is fully built up to the truncation
radius.
+This is done fairly efficiently using a breadth first parsing
strategy@footnote{@url{http://en.wikipedia.org/wiki/Breadth-first_search}}
which is implemented through an ordered linked list.
-## Make the profiles in catalog.txt over image.fits:
-$ astmkprof --background=image.fits catalog.txt
+Using this approach, we build the profile by expanding its circumference: not
a single extra pixel has to be checked (the calculation of @mymath{r_{el}}
above is not cheap in CPU terms).
+Another consequence of this strategy is that extending MakeProfiles to three
dimensions becomes very simple: only the neighbors of each pixel have to be
changed.
+Everything else after that (when the pixel index and its radial profile have
entered the linked list) is the same, no matter the number of dimensions we are
dealing with.
-## Make a Moffat PSF with FWHM 3pix, beta=2.8, truncation=5
-$ astmkprof --kernel=moffat,3,2.8,5 --oversample=1
-## Make profiles in catalog, using RA and Dec in the given column:
-$ astmkprof --ccol=RA_CENTER --ccol=DEC_CENTER --mode=wcs catalog.txt
-## Make a 1500x1500 merged image (oversampled 500x500) image along
-## with an individual image for all the profiles in catalog:
-$ astmkprof --individual --oversample 3 --mergedsize=500,500 cat.txt
-@end example
-@noindent
-The parameters of the mock profiles can either be given through a catalog
(which stores the parameters of many mock profiles, see @ref{MakeProfiles
catalog}), or the @option{--kernel} option (see @ref{MakeProfiles output
dataset}).
-The catalog can be in the FITS ASCII, FITS binary format, or plain text
formats (see @ref{Tables}).
-A plain text catalog can also be provided using the Standard input (see
@ref{Standard input}).
-The columns related to each parameter can be determined both by number, or by
match/search criteria using the column names, units, or comments, with the
options ending in @option{col}, see below.
-Without any file given to the @option{--background} option, MakeProfiles will
make a zero-valued image and build the profiles on that (its size and main WCS
parameters can also be defined through the options described in
@ref{MakeProfiles output dataset}).
-Besides the main/merged image containing all the profiles in the catalog, it
is also possible to build individual images for each profile (only enclosing
one full profile to its truncation radius) with the @option{--individual}
option.
+@node PSF, Stars, Defining an ellipse and ellipsoid, Modeling basics
+@subsubsection Point spread function
-If an image is given to the @option{--background} option, the pixels of that
image are used as the background value for every pixel hence flux value of each
profile pixel will be added to the pixel in that background value.
-You can disable this with the @code{--clearcanvas} option (which will
initialize the background to zero-valued pixels and build profiles over that).
-With the @option{--background} option, the values to all options relating to
the ``canvas'' (output size and WCS) will be ignored if specified:
@option{--oversample}, @option{--mergedsize}, @option{--prepforconv},
@option{--crpix}, @option{--crval}, @option{--cdelt}, @option{--cdelt},
@option{--pc}, @option{cunit} and @option{ctype}.
+@cindex PSF
+@cindex Point source
+@cindex Diffraction limited
+@cindex Point spread function
+@cindex Spread of a point source
+Assume we have a `point' source, or a source that is far smaller than the
maximum resolution (a pixel).
+When we take an image of it, it will `spread' over an area.
+To quantify that spread, we can define a `function'.
+This is how the ``point spread function'' or the PSF of an image is defined.
-The sections below discuss the options specific to MakeProfiles based on
context: the input catalog settings which can have many rows for different
profiles are discussed in @ref{MakeProfiles catalog}, in @ref{MakeProfiles
profile settings}, we discuss how you can set general profile settings (that
are the same for all the profiles in the catalog).
-Finally @ref{MakeProfiles output dataset} and @ref{MakeProfiles log file}
discuss the outputs of MakeProfiles and how you can configure them.
-Besides these, MakeProfiles also supports all the common Gnuastro program
options that are discussed in @ref{Common options}, so please flip through them
is well for a more comfortable usage.
+This `spread' can have various causes, for example, in ground-based astronomy,
due to the atmosphere.
+In practice we can never surpass the `spread' due to the diffraction of the
telescope aperture (even in Space!).
+Various other effects can also be quantified through a PSF.
+For example, the simple fact that we are sampling in a discrete space, namely
the pixels, also produces a very small `spread' in the image.
-When building 3D profiles, there are more degrees of freedom.
-Hence, more columns are necessary and all the values related to dimensions
(for example, size of dataset in each dimension and the WCS properties) must
also have 3 values.
-To allow having an independent set of default values for creating 3D profiles,
MakeProfiles also installs a @file{astmkprof-3d.conf} configuration file (see
@ref{Configuration files}).
-You can use this for default 3D profile values.
-For example, if you installed Gnuastro with the prefix @file{/usr/local} (the
default location, see @ref{Installation directory}), you can benefit from this
configuration file by running MakeProfiles like the example below.
-As with all configuration files, if you want to customize a given option, call
it before the configuration file.
+@cindex Blur image
+@cindex Convolution
+@cindex Image blurring
+@cindex PSF image size
+Convolution is the mathematical process by which we can apply a `spread' to an
image, or in other words blur the image, see @ref{Convolution process}.
+The sum of pixels of an image should remain unchanged after convolution.
+Therefore, it is important that the sum of all the pixels of the PSF be unity.
+The PSF image also has to have an odd number of pixels on its sides so one
pixel can be defined as the center.
-@example
-$ astmkprof --config=/usr/local/etc/astmkprof-3d.conf catalog.txt
-@end example
+In MakeProfiles, the PSF can be set by the two methods explained below:
-@cindex Shell alias
-@cindex Alias, shell
-@cindex Shell startup
-@cindex Startup, shell
-To further simplify the process, you can define a shell alias in any startup
file (for example, @file{~/.bashrc}, see @ref{Installation directory}).
-Assuming that you installed Gnuastro in @file{/usr/local}, you can add this
line to the startup file (you may put it all in one line, it is broken into two
lines here for fitting within page limits).
+@table @asis
-@example
-alias astmkprof-3d="astmkprof --config=/usr/local/etc/astmkprof-3d.conf"
-@end example
+@item Parametric functions
+@cindex FWHM
+@cindex PSF width
+@cindex Parametric PSFs
+@cindex Full Width at Half Maximum
+A known mathematical function is used to make the PSF.
+In this case, only the parameters to define the functions are necessary and
MakeProfiles will make a PSF based on the given parameters for each function.
+In both cases, the center of the profile has to be exactly in the middle of
the central pixel of the PSF (which is automatically done by MakeProfiles).
+When talking about the PSF, usually, the full width at half maximum or FWHM is
used as a scale of the width of the PSF.
-@noindent
-Using this alias, you can call MakeProfiles with the name
@command{astmkprof-3d} (instead of @command{astmkprof}).
-It will automatically load the 3D specific configuration file first, and then
parse any other arguments, options or configuration files.
-You can change the default values in this 3D configuration file by calling
them on the command-line as you do with @command{astmkprof}@footnote{Recall
that for single-invocation options, the last command-line invocation takes
precedence over all previous invocations (including those in the 3D
configuration file).
-See the description of @option{--config} in @ref{Operating mode options}.}.
+@table @cite
+@item Gaussian
+@cindex Gaussian distribution
+In the older papers, and to a lesser extent even today, some researchers use
the 2D Gaussian function to approximate the PSF of ground based images.
+In its most general form, a Gaussian function can be written as:
-Please see @ref{Sufi simulates a detection} for a very complete tutorial
explaining how one could use MakeProfiles in conjunction with other Gnuastro's
programs to make a complete simulated image of a mock galaxy.
+@dispmath{f(r)=a \exp \left( -(r-\mu)^2 \over 2\sigma^2 \right)+d}
-@menu
-* MakeProfiles catalog:: Required catalog properties.
-* MakeProfiles profile settings:: Configuration parameters for all profiles.
-* MakeProfiles output dataset:: The canvas/dataset to build profiles over.
-* MakeProfiles log file:: A description of the optional log file.
-@end menu
+Since the center of the profile is pre-defined, @mymath{\mu} and @mymath{d}
are constrained.
+@mymath{a} can also be found because the function has to be normalized.
+So the only important parameter for MakeProfiles is the @mymath{\sigma}.
+In the Gaussian function we have this relation between the FWHM and
@mymath{\sigma}:
-@node MakeProfiles catalog, MakeProfiles profile settings, Invoking astmkprof,
Invoking astmkprof
-@subsubsection MakeProfiles catalog
-The catalog containing information about each profile can be in the FITS
ASCII, FITS binary, or plain text formats (see @ref{Tables}).
-The latter can also be provided using standard input (see @ref{Standard
input}).
-Its columns can be ordered in any desired manner.
-You can specify which columns belong to which parameters using the set of
options discussed below.
-For example, through the @option{--rcol} and @option{--tcol} options, you can
specify the column that contains the radial parameter for each profile and its
truncation respectively.
-See @ref{Selecting table columns} for a thorough discussion on the values to
these options.
+@cindex Gaussian FWHM
+@dispmath{\rm{FWHM}_g=2\sqrt{2\ln{2}}\sigma \approx 2.35482\sigma}
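+For example, to build a Gaussian PSF with a FWHM of 3 pixels, the only
necessary parameter is @mymath{\sigma=3/2.35482\approx1.274} pixels.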
-The value for the profile center in the catalog (the @option{--ccol} option)
can be a floating point number so the profile center can be on any sub-pixel
position.
-Note that pixel positions in the FITS standard start from 1 and an integer is
the pixel center.
-So a 2D image actually starts from the position (0.5, 0.5), which is the
bottom-left corner of the first pixel.
-When a @option{--background} image with WCS information is provided, or you
specify the WCS parameters with the respective options@footnote{The options to
set the WCS are the following: @option{--crpix}, @option{--crval},
@option{--cdelt}, @option{--cdelt}, @option{--pc}, @option{cunit} and
@option{ctype}.
-Just recall that these options are only used if @option{--background} is not
given: if the image you give to @option{--background} does not have WCS, these
options will not be used and you cannot use WCS-mode coordinates like RA or
Dec.}, you may also use RA and Dec to identify the center of each profile (see
the @option{--mode} option below).
+@item Moffat
+@cindex Moffat function
+The Gaussian profile is much sharper than the images taken from stars on
photographic plates or CCDs.
+Therefore in 1969, Moffat proposed this functional form for the image of stars:
-In MakeProfiles, profile centers do not have to be in (overlap with) the final
image.
-Even if only one pixel of the profile within the truncation radius overlaps
with the final image size, the profile is built and included in the final image.
-Profiles that are completely out of the image will not be created (unless you
explicitly ask for it with the @option{--individual} option).
-You can use the output log file (created with @option{--log} to see which
profiles were within the image, see @ref{Common options}.
+@dispmath{f(r)=a \left[ 1+\left( r\over \alpha \right)^2 \right]^{-\beta}}
-If PSF profiles (Moffat or Gaussian, see @ref{PSF}) are in the catalog and the
profiles are to be built in one image (when @option{--individual} is not used),
it is assumed they are the PSF(s) you want to convolve your created image with.
-So by default, they will not be built in the output image but as separate
files.
-The sum of pixels of these separate files will also be set to unity (1) so you
are ready to convolve, see @ref{Convolution process}.
-As a summary, the position and magnitude of PSF profile will be ignored.
-This behavior can be disabled with the @option{--psfinimg} option.
-If you want to create all the profiles separately (with @option{--individual})
and you want the sum of the PSF profile pixels to be unity, you have to set
their magnitudes in the catalog to the zero point magnitude and be sure that
the central positions of the profiles do not have any fractional part (the PSF
center has to be in the center of the pixel).
+@cindex Moffat beta
+Again, @mymath{a} is constrained by the normalization, therefore two
parameters define the shape of the Moffat function: @mymath{\alpha} and
@mymath{\beta}.
+The radial parameter is @mymath{\alpha} which is related to the FWHM by
-The list of options directly related to the input catalog columns is shown
below.
+@cindex Moffat FWHM
+@dispmath{\rm{FWHM}_m=2\alpha\sqrt{2^{1/\beta}-1}}
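+For example, with @mymath{\beta=2.8} we have
@mymath{\rm{FWHM}_m\approx1.06\alpha}, so a Moffat PSF with a FWHM of 3 pixels
has @mymath{\alpha\approx2.83} pixels.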
-@table @option
+@cindex Compare Moffat and Gaussian
+@cindex PSF, Moffat compared Gaussian
+@noindent
+Comparing the PSF predicted from atmospheric turbulence theory with a
Moffat function, Trujillo et al.@footnote{
+Trujillo, I., J. A. L. Aguerri, J. Cepa, and C. M. Gutierrez (2001). ``The
effects of seeing on S@'ersic profiles - II. The Moffat PSF''. In: MNRAS 328,
pp. 977---985.}
+claim that @mymath{\beta} should be 4.765.
+They also show how the Moffat PSF contains the Gaussian PSF as a limiting case
when @mymath{\beta\to\infty}.
-@item --ccol=STR/INT
-Center coordinate column for each dimension.
-This option must be called two times to define the center coordinates in an
image.
-For example, @option{--ccol=RA} and @option{--ccol=DEC} (along with
@option{--mode=wcs}) will inform MakeProfiles to look into the catalog columns
named @option{RA} and @option{DEC} for the Right Ascension and Declination of
the profile centers.
+@end table
-@item --fcol=INT/STR
-The functional form of the profile with one of the values below depending on
the desired profile.
-The column can contain either the numeric codes (for example, `@code{1}') or
string characters (for example, `@code{sersic}').
-The numeric codes are easier to use in scripts which generate catalogs with
hundreds or thousands of profiles.
+@item An input FITS image
+An input image file can also be specified to be used as a PSF.
+If the sum of its pixels is not equal to 1, the pixels will be multiplied by
a fraction so the sum becomes 1.
-The string format can be easier when the catalog is to be written/checked by
hand/eye before running MakeProfiles.
-It is much more readable and provides a level of documentation.
-All Gnuastro's recognized table formats (see @ref{Recognized table formats})
accept string type columns.
-To have string columns in a plain text table/catalog, see @ref{Gnuastro text
table format}.
+Gnuastro has tools to extract the non-parametric (extended) PSF of any image
as a FITS file (assuming there are a sufficient number of stars in it), see
@ref{Building the extended PSF}.
+This method is not perfect (it will have noise if you do not have many stars),
but it is the actual PSF of the data that is not forced into any parametric
form.
+@end table
+
+While the Gaussian is only dependent on the FWHM, the Moffat function is also
dependent on @mymath{\beta}.
+Comparing these two functions with a fixed FWHM gives the following results:
@itemize
@item
-S@'ersic profile with `@code{sersic}' or `@code{1}'.
-
+Within the FWHM, the functions do not have significant differences.
@item
-Moffat profile with `@code{moffat}' or `@code{2}'.
-
+For a fixed FWHM, as @mymath{\beta} increases, the Moffat function becomes
sharper.
@item
-Gaussian profile with `@code{gaussian}' or `@code{3}'.
+The Gaussian function is much sharper than the Moffat functions, even when
@mymath{\beta} is large.
+@end itemize
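+You can check this yourself by building both kernels with the same FWHM and
comparing them, for example with the sketch below (using the
@option{--kernel} option of MakeProfiles, see @ref{Invoking astmkprof}):
+
+@example
+$ astmkprof --kernel=gaussian,3,5 --output=gaussian.fits
+$ astmkprof --kernel=moffat,3,2.8,5 --output=moffat.fits
+@end example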
-@item
-Point source with `@code{point}' or `@code{4}'.
-@item
-Flat profile with `@code{flat}' or `@code{5}'.
-@item
-Circumference profile with `@code{circum}' or `@code{6}'.
-A fixed value will be used for all pixels less than or equal to the truncation
radius (@mymath{r_t}) and greater than @mymath{r_t-w} (@mymath{w} is the value
to the @option{--circumwidth}).
-@item
-Radial distance profile with `@code{distance}' or `@code{7}'.
-At the lowest level, each pixel only has an elliptical radial distance given
the profile's shape and orientation (see @ref{Defining an ellipse and
ellipsoid}).
-When this profile is chosen, the pixel's elliptical radial distance from the
profile center is written as its value.
-For this profile, the value in the magnitude column (@option{--mcol}) will be
ignored.
+@node Stars, Galaxies, PSF, Modeling basics
+@subsubsection Stars
-You can use this for checks or as a first approximation to define your own
higher-level radial function.
-In the latter case, just note that the central values are going to be
incorrect (see @ref{Sampling from a function}).
+@cindex Modeling stars
+@cindex Stars, modeling
+In MakeProfiles, stars are generally considered to be point sources.
+This is usually the case for extra galactic studies, where nearby stars are
also in the field.
+Since a star is only a point source, we assume that it only fills one pixel
prior to convolution.
+In fact, exactly for this reason, in astronomical images the light profiles of
stars are one of the best methods to understand the shape of the PSF, and a
very large fraction of scientific research is performed by assuming the shapes
of stars to be the PSF of the image.
-@item
-Custom radial profile with `@code{custom-prof}' or `@code{8}'.
-The values to use for each radial interval should be in the table given to
@option{--customtable}.
-By default, once the profile is built with the given values, it will be scaled
to have a total magnitude that you have requested in the magnitude column of
the profile (in @option{--mcol}).
-If you want the raw values in the 2D profile (to ignore the magnitude column),
use @option{--mcolnocustprof}.
-For more, see the description of @option{--customtable} in @ref{MakeProfiles
profile settings}.
-@item
-Azimuthal angle profile with `@code{azimuth}' or `@code{9}'.
-Every pixel within the truncation radius will be given its azimuthal angle (in
degrees, from 0 to 360) from the major axis.
-In combination with the radial distance profile, you can now create complex
features in polar coordinates, such as tidal tails or tidal shocks (using the
Arithmetic program to mix the radius and azimuthal angle through a function to
create your desired features).
-@item
-Custom image with `@code{custom-img}' or `@code{10}'.
-The image(s) to use should be given to the @option{--customimg} option (which
can be called multiple times for multiple images).
-To identify which one of the images (given to @option{--customimg}) should be
used, you should specify their counter in the ``radius'' column below.
-For more, see the description of @code{custom-img} in @ref{MakeProfiles
profile settings}.
-@end itemize
-@item --rcol=STR/INT
-The radius parameter of the profiles.
-Effective radius (@mymath{r_e}) if S@'ersic, FWHM if Moffat or Gaussian.
-For a custom image profile, this option is not interpreted as a radius, but as
a counter (identifying which one of the images given to @option{--customimg}
should be used for each row).
+@node Galaxies, Sampling from a function, Stars, Modeling basics
+@subsubsection Galaxies
-@item --ncol=STR/INT
-The S@'ersic index (@mymath{n}) or Moffat @mymath{\beta}.
+@cindex Galaxy profiles
+@cindex S@'ersic profile
+@cindex Profiles, galaxies
+@cindex Generalized de Vaucouleur profile
+Today, most practitioners agree that the flux of galaxies can be modeled with
one or a few generalized de Vaucouleur's (or S@'ersic) profiles.
-@item --pcol=STR/INT
-The position angle (in degrees) of the profiles relative to the first FITS
axis (horizontal when viewed in SAO DS9).
-When building a 3D profile, this is the first Euler angle: first rotation of
the ellipsoid major axis from the first FITS axis (rotating about the third
axis).
-See @ref{Defining an ellipse and ellipsoid}.
+@dispmath{I(r) = I_e \exp \left ( -b_n \left[ \left( r \over r_e \right)^{1/n}
-1 \right] \right )}
-@item --p2col=STR/INT
-Second Euler angle (in degrees) when building a 3D ellipsoid.
-This is the second rotation of the ellipsoid major axis (following
@option{--pcol}) about the (rotated) X axis.
-See @ref{Defining an ellipse and ellipsoid}.
-This column is ignored when building a 2D profile.
+@cindex Brightness
+@cindex S@'ersic, J. L.
+@cindex S@'ersic index
+@cindex Effective radius
+@cindex Radius, effective
+@cindex de Vaucouleur profile
+@cindex G@'erard de Vaucouleurs
+G@'erard de Vaucouleurs (1918-1995) was the first to show, in 1948, that this
function resembles the galaxy light profiles, with the only difference that he
held @mymath{n} fixed to a value of 4.
+Twenty years later in 1968, J. L. S@'ersic showed that @mymath{n} can have a
variety of values and does not necessarily need to be 4.
+This profile depends on the effective radius (@mymath{r_e}) which is defined
as the radius which contains half of the profile's 2-dimensional integral to
infinity (see @ref{Profile magnitude}).
+@mymath{I_e} is the flux at the effective radius.
+The S@'ersic index @mymath{n} is used to define the concentration of the
profile within @mymath{r_e} and @mymath{b_n} is a constant dependent on
@mymath{n}.
+MacArthur et al.@footnote{MacArthur, L. A., S. Courteau, and J. A. Holtzman
(2003). ``Structure of Disk-dominated Galaxies. I. Bulge/Disk Parameters,
Simulations, and Secular Evolution''. In: ApJ 582, pp. 689---722.} show that
for @mymath{n>0.35}, @mymath{b_n} can be accurately approximated using this
equation:
-@item --p3col=STR/INT
-Third Euler angle (in degrees) when building a 3D ellipsoid.
-This is the third rotation of the ellipsoid major axis (following
@option{--pcol} and @option{--p2col}) about the (rotated) Z axis.
-See @ref{Defining an ellipse and ellipsoid}.
-This column is ignored when building a 2D profile.
+@dispmath{b_n=2n - {1\over 3} + {4\over 405n} + {46\over 25515n^2} + {131\over 1148175n^3}-{2194697\over 30690717750n^4}}
+
+For example, this approximation gives @mymath{b_1\approx1.678} for an exponential profile and @mymath{b_4\approx7.669} for the classic de Vaucouleurs profile.
-@item --qcol=STR/INT
-The axis ratio of the profiles (minor axis divided by the major axis in a 2D
ellipse).
-When building a 3D ellipse, this is the ratio of the major axis to the
semi-axis length of the second dimension (in a right-handed coordinate system).
-See @mymath{q1} in @ref{Defining an ellipse and ellipsoid}.
-@item --q2col=STR/INT
-The ratio of the ellipsoid major axis to the third semi-axis length (in a
right-handed coordinate system) of a 3D ellipsoid.
-See @mymath{q1} in @ref{Defining an ellipse and ellipsoid}.
-This column is ignored when building a 2D profile.
-@item --mcol=STR/INT
-The total pixelated magnitude of the profile within the truncation radius, see
@ref{Profile magnitude}.
-@item --tcol=STR/INT
-The truncation radius of this profile.
-By default it is in units of the radial parameter of the profile (the value in
the @option{--rcol} of the catalog).
-If @option{--tunitinp} is given, this value is interpreted in units of pixels
(prior to oversampling) irrespective of the profile.
-@end table
+@node Sampling from a function, Oversampling, Galaxies, Modeling basics
+@subsubsection Sampling from a function
-@node MakeProfiles profile settings, MakeProfiles output dataset, MakeProfiles
catalog, Invoking astmkprof
-@subsubsection MakeProfiles profile settings
+@cindex Sampling
+A pixel is the ultimate level of accuracy with which data are gathered: we cannot get any more accurate within one image.
+In signal processing, this is known as sampling.
+However, the mathematical profiles which describe our models have infinite
accuracy.
+Over a large fraction of the area of astrophysically interesting profiles (for
example, galaxies or PSFs), the variation of the profile over the area of one
pixel is not too significant.
+In such cases, the elliptical radius (@mymath{r_{el}}) of the center of the pixel can be assigned as the final value of the pixel (see @ref{Defining an ellipse and ellipsoid}).
-The profile parameters that differ between each created profile are specified
through the columns in the input catalog and described in @ref{MakeProfiles
catalog}.
-Besides those, there are general settings for some profiles that do not differ between one profile and another: they are a property of the general process.
-For example, the number of random points to use in the Monte Carlo integration is fixed for all the profiles.
-The options described in this section are for configuring such properties.
+@cindex Integration over pixel
+@cindex Gradient over pixel area
+@cindex Function gradient over pixel area
+As you approach their center, some galaxies become very sharp (their value
significantly changes over one pixel's area).
+This sharpness increases with smaller effective radius and larger S@'ersic index, rendering the central value extremely inaccurate.
+The first method that comes to mind for solving this problem is integration.
+The functional form of the profile can be integrated over the pixel area in a
2D integration process.
+However, numerical integration techniques also have their limitations and, when such sharp profiles are needed, they can become extremely inaccurate.
-@table @option
+@cindex Monte carlo integration
+The most accurate method of sampling a continuous profile on a discrete space
is by choosing a large number of random points within the boundaries of the
pixel and taking their average value (or Monte Carlo integration).
+This is also, generally speaking, what happens in practice with the photons on
the pixel.
+The number of random points can be set with @option{--numrandom}.
-@item --mode=STR
-Interpret the center position columns (@option{--ccol} in @ref{MakeProfiles
catalog}) in image or WCS coordinates.
-This option thus accepts only two values: @option{img} and @option{wcs}.
-It is mandatory when a catalog is being used as input.
+Unfortunately, repeating this Monte Carlo process would be extremely time and
CPU consuming if it is to be applied to every pixel.
+In order not to lose too much accuracy, in MakeProfiles the profile is built using both methods, as explained below.
+The building of the profile begins from its central pixel and continues
(radially) outwards.
+Monte Carlo integration is first applied (which yields @mymath{F_r}), then the
central pixel value (@mymath{F_c}) is calculated on the same pixel.
+If the fractional difference (@mymath{|F_r-F_c|/F_r}) is lower than a given
tolerance level (specified with @option{--tolerance}) MakeProfiles will stop
using Monte Carlo integration and only use the central pixel value.
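+For example, here is a minimal sketch (the best values depend on your profiles) requesting more random points and a stricter tolerance:
+
+@example
+$ astmkprof --numrandom=10000 --tolerance=0.001 catalog.txt
+@end example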
-@item -r
-@itemx --numrandom
-The number of random points used in the central regions of the profile, see
@ref{Sampling from a function}.
+@cindex Inside-out construction
+The ordering of the pixels in this inside-out construction is based on
@mymath{r=\sqrt{(i_c-i)^2+(j_c-j)^2}}, not @mymath{r_{el}}, see @ref{Defining
an ellipse and ellipsoid}.
+When the axis ratios are large (near one) this is fine.
+But when they are small and the object is highly elliptical, it might seem
more reasonable to follow @mymath{r_{el}} not @mymath{r}.
+The problem is that the gradient is stronger in pixels with a smaller @mymath{r} (and larger @mymath{r_{el}}) than in those with a smaller @mymath{r_{el}}.
+In other words, the gradient is strongest along the minor axis.
+So if the next pixel is chosen based on @mymath{r_{el}}, the tolerance level
will be reached sooner and lots of pixels with large fractional differences
will be missed.
+
+Monte Carlo integration uses randomly positioned points.
+Thus, every time you run it, by default, you will get a different distribution
of points to sample within the pixel.
+In the case of large profiles, this will result in a slight difference of the
pixels which use Monte Carlo integration each time MakeProfiles is run.
+To have a deterministic result, you have to fix the properties of the random number generator that is used to build the random distribution.
+This can be done by setting the @code{GSL_RNG_TYPE} and @code{GSL_RNG_SEED}
environment variables and calling MakeProfiles with the @option{--envseed}
option.
+To learn more about the process of generating random numbers, see
@ref{Generating random numbers}.
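+For example, here is a minimal sketch of a reproducible run (the generator name and seed value below are arbitrary choices):
+
+@example
+## Fix the random number generator through the environment, then
+## tell MakeProfiles to use these values with '--envseed':
+$ export GSL_RNG_TYPE=ranlxs1
+$ export GSL_RNG_SEED=1599251212
+$ astmkprof --envseed catalog.txt
+@end example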
-@item -e
-@itemx --envseed
@cindex Seed, Random number generator
@cindex Random number generator, Seed
-Use the value to the @code{GSL_RNG_SEED} environment variable to generate the
random Monte Carlo sampling distribution, see @ref{Sampling from a function}
and @ref{Generating random numbers}.
+The seed values are fixed for every profile: with @option{--envseed}, all the
profiles have the same seed and without it, each will get a different seed
using the system clock (which is accurate to within one microsecond).
+The same seed will be used to generate a random number for all the sub-pixel
positions of all the profiles.
+So in the former, the sub-pixel points checked for all the pixels undergoing Monte Carlo integration in all profiles will be identical.
+In other words, the sub-pixel points in the first (closest to the center)
pixel of all the profiles will be identical with each other.
+All the second pixels studied for all the profiles will also receive an
identical (different from the first pixel) set of sub-pixel points and so on.
+As long as the number of random points used is large enough or the profiles
are not identical, this should not cause any systematic bias.
-@item -t FLT
-@itemx --tolerance=FLT
-The tolerance to switch from Monte Carlo integration to the central pixel
value, see @ref{Sampling from a function}.
-@item -p
-@itemx --tunitinp
-The truncation column of the catalog is in units of pixels.
-By default, the truncation column is considered to be in units of the radial
parameters of the profile (@option{--rcol}).
-Read it as `t-unit-in-p' for `truncation unit in pixels'.
+@node Oversampling, , Sampling from a function, Modeling basics
+@subsubsection Oversampling
-@item -f
-@itemx --mforflatpix
-When making fixed value profiles (``flat'', ``circumference'' or ``point''
profiles, see `@option{--fcol}'), do not use the value in the column specified
by `@option{--mcol}' as the magnitude.
-Instead use it as the exact value that all the pixels of these profiles should
have.
-This option is irrelevant for other types of profiles.
-This option is very useful for creating masks, or labeled regions in an image.
-Any integer or floating point value can be used in this column with this option, including @code{NaN} (or `@code{nan}', or `@code{NAN}', case is irrelevant), and infinities (@code{inf}, @code{-inf}, or @code{+inf}).
+@cindex Oversampling
+The steps explained in @ref{Sampling from a function} do give an accurate
representation of a profile prior to convolution.
+However, in an actual observation, the image is first convolved with or
blurred by the atmospheric and instrument PSF in a continuous space and then it
is sampled on the discrete pixels of the camera.
-For example, with this option if you set the value in the magnitude column
(@option{--mcol}) to @code{NaN}, you can create an elliptical or circular mask
over an image (which can be given as the argument), see @ref{Blank pixels}.
-Another useful application of this option is to create labeled elliptical or
circular apertures in an image.
-To do this, set the value in the magnitude column to the label you want for
this profile.
-This labeled image can then be used in combination with NoiseChisel's output
(see @ref{NoiseChisel output}) to do aperture photometry with MakeCatalog (see
@ref{MakeCatalog}).
+@cindex PSF over-sample
+In order to more accurately simulate this process, the unconvolved image and
the PSF are created on a finer pixel grid.
+In other words, the output image is a certain odd-integer multiple of the desired size; we can call this `oversampling'.
+The user can specify this multiple as a command-line option.
+The reason this has to be an odd number is that the PSF has to be centered on
the center of its image.
+An image with an even number of pixels on each side does not have a central
pixel.
-Alternatively, if you want to mark regions of the image (for example, with an
elliptical circumference) and you do not want to use NaN values (as explained
above) for some technical reason, you can get the minimum or maximum value in
the image @footnote{
-The minimum will give a better result, because the maximum can be too high
compared to most pixels in the image, making it harder to display.}
-using Arithmetic (see @ref{Arithmetic}), then use that value in the magnitude
column along with this option for all the profiles.
+The image can then be convolved with the PSF (which should also be oversampled
on the same scale).
+Finally, the image can be sub-sampled to get to the initially desired pixel size of the output image.
+After this, mock noise can be added, as explained in the next section.
+This is because, unlike the PSF, the noise occurs in each output pixel, not in a continuous space like all the prior steps.
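+As a rough sketch of this full process (all file names here are hypothetical, and the oversampling factor of 3 is only an example):
+
+@example
+## Build the profiles and the PSF on a 3 times finer grid:
+$ astmkprof --oversample=3 --mergedsize=200,200 catalog.txt \
+            --output=big.fits
+$ astmkprof --kernel=moffat,3,2.8,5 --oversample=3 --output=psf.fits
+
+## Convolve the oversampled image with the oversampled PSF:
+$ astconvolve big.fits --kernel=psf.fits --output=conv.fits
+
+## Sub-sample back to the desired pixel size with Warp:
+$ astwarp --scale=1/3 --centeroncorner conv.fits
+@end example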
-Please note that when using MakeProfiles on an already existing image, you
have to set `@option{--oversample=1}'.
-Otherwise all the profiles will be scaled up based on the oversampling scale
in your configuration files (see @ref{Configuration files}) unless you have
accounted for oversampling in your catalog.
-@item --mcolissum
-The value given in the ``magnitude'' column (specified by @option{--mcol}, see
@ref{MakeProfiles catalog}) must be interpreted as total sum of pixel values,
not magnitude (which is measured from the total sum and zero point, see
@ref{Brightness flux magnitude}).
-When this option is called, the zero point magnitude (value to the
@option{--zeropoint} option) is ignored and the given value must have the same
units as the input dataset's pixels.
-Recall that the total profile magnitude that is specified with in the
@option{--mcol} column of the input catalog is not an integration to infinity,
but the actual sum of pixels in the profile (until the desired truncation
radius).
-See @ref{Profile magnitude} for more on this point.
-@item --mcolnocustprof
-Do Not touch (re-scale) the custom profile that should be inserted in
@code{custom-prof} profile (see the description of @option{--fcol} in
@ref{MakeProfiles catalog} or the description of @option{--customtable} below).
-By default, MakeProfiles will scale (multiply) the custom image's pixels to
have the desired magnitude (or sum of pixels if @option{--mcolissum} is called)
in that row.
+@node If convolving afterwards, Profile magnitude, Modeling basics,
MakeProfiles
+@subsection If convolving afterwards
-@item --mcolnocustimg
-Do Not touch (re-scale) the custom image that should be inserted in
@code{custom-img} profile (see the description of @option{--fcol} in
@ref{MakeProfiles catalog}).
-By default, MakeProfiles will scale (multiply) the custom image's pixels to
have the desired magnitude (or sum of pixels if @option{--mcolissum} is called)
in that row.
+In case you want to convolve the image later with a given point spread
function, make sure to use a larger image size.
+After convolution, the profiles become larger and a profile that is normally
completely outside of the image might fall within it.
-@item --magatpeak
-The magnitude column in the catalog (see @ref{MakeProfiles catalog}) will be
used to set the value only for the profile's peak (maximum) pixel, not the full
profile.
-Note that this is the flux of the profile's peak (maximum) pixel in the final
output of MakeProfiles.
-So beware of the oversampling, see @ref{Oversampling}.
+On one axis, if you want your final (convolved) image to be @mymath{m} pixels
and your PSF is @mymath{2n+1} pixels wide, then when calling MakeProfiles, set
the axis size to @mymath{m+2n}, not @mymath{m}.
+You also have to shift all the pixel positions of the profile centers on that axis by @mymath{n} pixels in the positive direction.
-This option can be useful if you want to check a mock profile's total
magnitude at various truncation radii.
-Without this option, no matter what the truncation radius is, the total
magnitude will be the same as that given in the catalog.
-But with this option, the total magnitude will become brighter as you increase
the truncation radius.
+After convolution, you can crop the outer @mymath{n} pixels with the section
crop box specification of Crop: @option{--section=n+1:*-n,n+1:*-n} (according
to the FITS standard, counting is from 1 so we use @code{n+1}) assuming your
PSF is a square, see @ref{Crop section syntax}.
+This will also remove all discrete Fourier transform artifacts (blurred sides)
from the final image.
+To facilitate this shift, MakeProfiles has the options @option{--xshift},
@option{--yshift} and @option{--prepforconv}, see @ref{Invoking astmkprof}.
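+As a hypothetical sketch (with @mymath{m=100} and an 11 pixel wide PSF, so @mymath{n=5}; all file names are only for illustration):
+
+@example
+## Build the profiles on an enlarged, shifted canvas (n=5):
+$ astmkprof --mergedsize=100,100 --shift=5,5 --oversample=1 \
+            --output=big.fits catalog.txt
+
+## Convolve with your PSF (psf.fits is hypothetical):
+$ astconvolve big.fits --kernel=psf.fits --output=conv.fits
+
+## Crop the outer n=5 pixels (FITS counting starts from 1):
+$ astcrop conv.fits --section=6:*-5,6:*-5
+@end example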
-In sharper profiles, sometimes the accuracy of measuring the peak profile flux
is more than the overall object sum or magnitude.
-In such cases, with this option, the final profile will be built such that its
peak has the given magnitude, not the total profile.
-@cartouche
-@strong{CAUTION:} If you want to use this option for comparing with
observations, please note that MakeProfiles does not do convolution.
-Unless you have de-convolved your data, your images are convolved with the
instrument and atmospheric PSF, see @ref{PSF}.
-Particularly in sharper profiles, the flux in the peak pixel is strongly
decreased after convolution.
-Also note that in such cases, besides de-convolution, you will have to set
@option{--oversample=1} otherwise after resampling your profile with Warp (see
@ref{Warp}), the peak flux will be different.
-@end cartouche
-@item --customtable FITS/TXT
-The filename of the table to use in the custom radial profiles (see
description of @option{--fcol} in @ref{MakeProfiles catalog}.
-This can be a plain-text table, or FITS table, see @ref{Recognized table
formats}, if it is a FITS table, you can use @option{--customtablehdu} to
specify which HDU should be used (described below).
-A custom radial profile can have any value you want for a given radial profile
(including NaN/blank values).
-Each interval is defined by its minimum (inclusive) and maximum (exclusive)
radius, when a pixel center falls within a radius interval, the value specified
for that interval will be used.
-If a pixel is not in the given intervals, a value of 0.0 will be used for that
pixel.
-The table should have 3 columns as shown below.
-If the intervals are contiguous (the maximum value of the previous interval is
equal to the minimum value of an interval) and the intervals all have the same
size (difference between minimum and maximum values) the creation of these
profiles will be fast.
-However, if the intervals are not sorted and contiguous, MakeProfiles will
parse the intervals from the top of the table and use the first interval that
contains the pixel center (this may slow it down).
-@table @asis
-@item Column 1:
-The interval's minimum radius.
-@item Column 2:
-The interval's maximum radius.
-@item Column 3:
-The value to be used for pixels within the given interval (including
NaN/blank).
-@end table
-Gnuastro's column arithmetic in the Table program has the
@code{sorted-to-interval} operator that will generate the first two columns
from a single column (your radial profile).
-See the description of that operator in @ref{Column arithmetic} and the
example below.
-By default, once a 2D image is constructed for the radial profile, it will be
scaled such that its total magnitude corresponds to the value in the magnitude
column (@option{--mcol}) of the main input catalog.
-If you want to disable the scaling and use the raw values in your custom
profile (in other words: you want to ignore the magnitude column) you need to
call @option{--mcolnocustprof} (see above).
-In the example below, we'll start with a certain radial profile, and use this
option to build its 2D representation in an image (recall that you can build
radial profiles with @ref{Generate radial profile}).
-But first, we will need to use the @code{sorted-to-interval} to build the
necessary input format (see @ref{Column arithmetic}).
-@example
-$ cat radial.txt
-# Column 1: RADIUS [pix ,f32,] Radial distance
-# Column 2: MEAN [input-units,f32,] Mean of values.
-0.0 1.00000
-1.0 0.50184
-1.4 0.37121
-2.0 0.26414
-2.2 0.23427
-2.8 0.17868
-3.0 0.16627
-3.1 0.15567
-3.6 0.13132
-4.0 0.11404
+@node Profile magnitude, Invoking astmkprof, If convolving afterwards,
MakeProfiles
+@subsection Profile magnitude
-## Convert the radius in each row to an interval
-$ asttable radial.txt --output=interval.fits \
- -c'arith RADIUS sorted-to-interval',MEAN
+@cindex Truncation radius
+@cindex Sum for total flux
+To find the profile's total magnitude (see @ref{Brightness flux magnitude}), it is customary to use the 2D integration of the flux to infinity.
+However, in MakeProfiles we do not follow this idealistic approach and apply a
more realistic method to find the total magnitude: the sum of all the pixels
belonging to a profile within its predefined truncation radius.
+Note that if the truncation radius is not large enough, this can be
significantly different from the total integrated light to infinity.
-## Inspect the table containing intervals
-$ asttable interval.fits -ffixed
--0.500000 0.500000 1.000000
-0.500000 1.200000 0.501840
-1.200000 1.700000 0.371210
-1.700000 2.100000 0.264140
-2.100000 2.500000 0.234270
-2.500000 2.900000 0.178680
-2.900000 3.050000 0.166270
-3.050000 3.350000 0.155670
-3.350000 3.800000 0.131320
-3.800000 4.200000 0.114040
+@cindex Integration to infinity
+An integration to infinity is not a realistic condition because no galaxy extends indefinitely (important for high S@'ersic index profiles).
+Pixelation can also cause a significant difference between the actual total pixel sum of the profile and that of integration to infinity, especially in small and high S@'ersic index profiles.
+To be safe, you can specify a large enough truncation radius for such compact
high S@'ersic index profiles.
-## Build the 2D image of the profile from the interval.
-$ echo "1 7 7 8 10 2.5 0 1 1 2" \
- | astmkprof --mergedsize=13,13 --oversample=1 \
- --customtable=interval.fits \
- --output=image.fits
+If oversampling is used, the pixel value is calculated using the over-sampled image (see @ref{Oversampling}), which is much more accurate.
+The profile is first built in an array completely bounding it with a
normalization constant of unity (see @ref{Galaxies}).
+Taking @mymath{V} to be the desired pixel value and @mymath{S} to be the sum
of the pixels in the created profile, every pixel is then multiplied by
@mymath{V/S} so the sum is exactly @mymath{V}.
-## View the created FITS image.
-$ astscript-fits-view image.fits --ds9scale=minmax
-@end example
+If the @option{--individual} option is called, this same array is written to a
FITS file.
+If not, only the overlapping pixels of this array and the output image are
kept and added to the output array.
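+For example, if the zero point is 25 and the requested magnitude is 20, the desired sum is @mymath{V=10^{(25-20)/2.5}=100}.
+You can check such conversions on the command-line with Gnuastro's Arithmetic program (just a sketch, see @ref{Arithmetic}):
+
+@example
+## V = 10^((zeropoint-magnitude)/2.5), here 10^2=100:
+$ astarithmetic 10 25 20 - 2.5 / pow --quiet
+@end example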
-Recall that if you want your image pixels to have the same values as the
@code{MEAN} column in your profile, you should run MakeProfiles with
@option{--mcolnocustprof}.
-@item --customtablehdu INT/STR
-The HDU/extension in the FITS file given to @option{--customtable}.
-@item --customimg=STR[,STR]
-A custom FITS image that should be used for the @code{custom-img} profiles
(see the description of @option{--fcol} in @ref{MakeProfiles catalog}).
-Multiple files can be given to this option (separated by a comma), and this
option can be called multiple times itself (useful when many custom image
profiles should be added).
-If the HDU of the images are different, you can use @option{--customimghdu}
(described below).
-Through the ``radius'' column, MakeProfiles will know which one of the images
given to this option should be used in each row.
-For example, let's assume your input catalog (@file{cat.fits}) has the
following contents (output of first command below), and you call MakeProfiles
like the second command below to insert four profiles into the background
@file{back.fits} image.
-The first profile below is S@'ersic (with an @option{--fcol}, or 4th column, code of @code{1}).
-So MakeProfiles builds the pixels of the first profile, and all column values
are meaningful.
-However, the second, third and fourth inserted objects are custom images (with
an @option{--fcol} code of @code{10}).
-For the custom image profiles, you see that the radius column has values of
@code{1} or @code{2}.
-This tells MakeProfiles to use the first image given to @option{--customimg}
(or @file{gal-1.fits}) for the second and fourth inserted objects.
-The second image given to @option{--customimg} (or @file{gal-2.fits}) will be used for the third inserted object.
-Finally, all three custom image profiles have different magnitudes, and the
values in @option{--ncol}, @option{--pcol}, @option{--qcol} and @option{--tcol}
are ignored.
-@example
-$ cat cat.fits
-1 53.15506 -27.785165 1 20 1 20 0.6 25 5
-2 53.15602 -27.777887 10 1 0 0 0 22 0
-3 53.16440 -27.775876 10 2 0 0 0 24 0
-4 53.16849 -27.787406 10 1 0 0 0 23 0
+@node Invoking astmkprof, , Profile magnitude, MakeProfiles
+@subsection Invoking MakeProfiles
-$ astmkprof cat.fits --mode=wcs --zeropoint=25.68 \
- --background=back.fits --output=out.fits \
- --customimg=gal-1.fits --customimg=gal-2.fits
+MakeProfiles will make any number of profiles specified in a catalog, either individually or in one merged image.
+The executable name is @file{astmkprof}, with the following general template:
+
+@example
+$ astmkprof [OPTION ...] [Catalog]
@end example
-@item --customimghdu=INT/STR
-The HDU(s) of the images given to @option{--customimg}.
-If this option is only called once, but @option{--customimg} is called many
times, MakeProfiles will assume that all images given to @option{--customimg}
have the same HDU.
-Otherwise (if the number of HDUs is equal to the number of images), then each
image will use its corresponding HDU.
+@noindent
+One line examples:
-@item -X INT,INT
-@itemx --shift=INT,INT
-Shift all the profiles and enlarge the image along each dimension.
-To better understand this option, please see @mymath{n} in @ref{If convolving
afterwards}.
-This is useful when you want to convolve the image afterwards.
-If you are using an external PSF, be sure to oversample it to the same scale
used for creating the mock images.
-If a background image is specified, any possible value to this option is
ignored.
+@example
+## Make an image with profiles in catalog.txt (with default size):
+$ astmkprof catalog.txt
-@item -c
-@itemx --prepforconv
-Shift all the profiles and enlarge the image based on half the width of the first Moffat or Gaussian profile in the catalog, considering any possible oversampling, see @ref{If convolving afterwards}.
-@option{--prepforconv} is only checked and possibly activated if
@option{--xshift} and @option{--yshift} are both zero (after reading the
command-line and configuration files).
-If a background image is specified, any possible value to this option is
ignored.
+## Make the profiles in catalog.txt over image.fits:
+$ astmkprof --background=image.fits catalog.txt
-@item -z FLT
-@itemx --zeropoint=FLT
-The zero point magnitude of the input.
-For more on the zero point magnitude, see @ref{Brightness flux magnitude}.
+## Make a Moffat PSF with FWHM 3pix, beta=2.8, truncation=5
+$ astmkprof --kernel=moffat,3,2.8,5 --oversample=1
-@item -w FLT
-@itemx --circumwidth=FLT
-The width of the circumference if the profile is to be an elliptical
circumference or annulus.
-See the explanations for this type of profile in @option{--fcol}.
+## Make profiles in catalog, using RA and Dec in the given column:
+$ astmkprof --ccol=RA_CENTER --ccol=DEC_CENTER --mode=wcs catalog.txt
-@item -R
-@itemx --replace
-Do not add the pixels of each profile over the background, or other profiles.
-But replace the values.
+## Make a 1500x1500 merged image (500x500, oversampled by 3) along
+## with an individual image for each profile in the catalog:
+$ astmkprof --individual --oversample 3 --mergedsize=500,500 cat.txt
+@end example
-By default, when two profiles overlap, the final pixel value is the sum of all
the profiles that overlap on that pixel.
-This is the expected situation when dealing with physical object profiles like
galaxies or stars/PSF.
-However, when MakeProfiles is used to build integer labeled images (for
example, in @ref{Aperture photometry}), this is not the expected situation: the
sum of two labels will be a new label.
-With this option, the pixels are not added but the largest (maximum) value
over that pixel is used.
-Because the maximum operator is independent of the order of values, the output
is also thread-safe.
+@noindent
+The parameters of the mock profiles can either be given through a catalog
(which stores the parameters of many mock profiles, see @ref{MakeProfiles
catalog}), or the @option{--kernel} option (see @ref{MakeProfiles output
dataset}).
+The catalog can be in the FITS ASCII, FITS binary, or plain text formats (see @ref{Tables}).
+A plain text catalog can also be provided using the Standard input (see @ref{Standard input}).
+The columns related to each parameter can be determined either by number, or by match/search criteria using the column names, units, or comments, with the options ending in @option{col}; see below.
-@end table
+Without any file given to the @option{--background} option, MakeProfiles will
make a zero-valued image and build the profiles on that (its size and main WCS
parameters can also be defined through the options described in
@ref{MakeProfiles output dataset}).
+Besides the main/merged image containing all the profiles in the catalog, it
is also possible to build individual images for each profile (only enclosing
one full profile to its truncation radius) with the @option{--individual}
option.
-@node MakeProfiles output dataset, MakeProfiles log file, MakeProfiles profile
settings, Invoking astmkprof
-@subsubsection MakeProfiles output dataset
-MakeProfiles takes an input catalog and uses the basic properties that are defined there to build a dataset, for example, a 2D image containing the profiles in the catalog.
-In @ref{MakeProfiles catalog} and @ref{MakeProfiles profile settings}, the
catalog and profile settings were discussed.
-The options of this section, allow you to configure the output dataset (or the
canvas that will host the built profiles).
+If an image is given to the @option{--background} option, the pixels of that image are used as the background values: the flux of each profile pixel will be added to the corresponding background pixel.
+You can disable this with the @option{--clearcanvas} option (which will initialize the background to zero-valued pixels and build the profiles over that).
+With the @option{--background} option, the values to all options relating to the ``canvas'' (output size and WCS) will be ignored if specified: @option{--oversample}, @option{--mergedsize}, @option{--prepforconv}, @option{--crpix}, @option{--crval}, @option{--cdelt}, @option{--pc}, @option{--cunit} and @option{--ctype}.
-@table @option
+The sections below discuss the options specific to MakeProfiles based on context: the input catalog settings (which can have many rows for different profiles) are discussed in @ref{MakeProfiles catalog}; in @ref{MakeProfiles profile settings}, we discuss how you can set general profile settings (that are the same for all the profiles in the catalog).
+Finally, @ref{MakeProfiles output dataset} and @ref{MakeProfiles log file} discuss the outputs of MakeProfiles and how you can configure them.
+Besides these, MakeProfiles also supports all the common Gnuastro program options that are discussed in @ref{Common options}, so please flip through them as well for a more comfortable usage.
-@item -k FITS
-@itemx --background=FITS
-A background image FITS file to build the profiles on.
-The extension that contains the image should be specified with the
@option{--backhdu} option, see below.
-When a background image is specified, it will be used to derive all the
information about the output image.
-Hence, the following options will be ignored: @option{--mergedsize},
@option{--oversample}, @option{--crpix}, @option{--crval} (generally, all other
WCS related parameters) and the output's data type (see @option{--type} in
@ref{Input output options}).
+When building 3D profiles, there are more degrees of freedom.
+Hence, more columns are necessary and all the values related to dimensions
(for example, size of dataset in each dimension and the WCS properties) must
also have 3 values.
+To allow having an independent set of default values for creating 3D profiles,
MakeProfiles also installs a @file{astmkprof-3d.conf} configuration file (see
@ref{Configuration files}).
+You can use this for default 3D profile values.
+For example, if you installed Gnuastro with the prefix @file{/usr/local} (the
default location, see @ref{Installation directory}), you can benefit from this
configuration file by running MakeProfiles like the example below.
+As with all configuration files, if you want to customize a given option, call
it before the configuration file.
-The background image will act like a canvas to build the profiles on: profile
pixel values will be summed with the background image pixel values.
-With the @option{--replace} option you can disable this behavior and replace
the profile pixels with the background pixels.
-If you want to use all the image information above, except for the pixel
values (you want to have a blank canvas to build the profiles on, based on an
input image), you can call @option{--clearcanvas}, to set all the input image's
pixels to zero before starting to build the profiles over it (this is done in
memory after reading the input, so nothing will happen to your input file).
+@example
+$ astmkprof --config=/usr/local/etc/astmkprof-3d.conf catalog.txt
+@end example
-@item -B STR/INT
-@itemx --backhdu=STR/INT
-The header data unit (HDU) of the file given to @option{--background}.
+@cindex Shell alias
+@cindex Alias, shell
+@cindex Shell startup
+@cindex Startup, shell
+To further simplify the process, you can define a shell alias in any startup
file (for example, @file{~/.bashrc}, see @ref{Installation directory}).
+Assuming that you installed Gnuastro in @file{/usr/local}, you can add this
line to the startup file (you may put it all in one line, it is broken into two
lines here for fitting within page limits).
-@item -C
-@itemx --clearcanvas
-When an input image is specified (with the @option{--background} option), set all its pixels to 0.0 immediately after reading it into memory.
-Effectively, this will allow you to use all its properties (described under
the @option{--background} option), without having to worry about the pixel
values.
+@example
+alias astmkprof-3d="astmkprof --config=/usr/local/etc/astmkprof-3d.conf"
+@end example
-@option{--clearcanvas} can come in handy in many situations, for example, if
you want to create a labeled image (segmentation map) for creating a catalog
(see @ref{MakeCatalog}).
-In other cases, you might have modeled the objects in an image and want to
create them on the same frame, but without the original pixel values.
+@noindent
+Using this alias, you can call MakeProfiles with the name
@command{astmkprof-3d} (instead of @command{astmkprof}).
+It will automatically load the 3D specific configuration file first, and then
parse any other arguments, options or configuration files.
+You can change the default values in this 3D configuration file by calling
them on the command-line as you do with @command{astmkprof}@footnote{Recall
that for single-invocation options, the last command-line invocation takes
precedence over all previous invocations (including those in the 3D
configuration file).
+See the description of @option{--config} in @ref{Operating mode options}.}.
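+For example, a minimal sketch of keeping the 3D defaults while using a different number of random points:
+
+@example
+$ astmkprof-3d --numrandom=5000 catalog.txt
+@end example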
-@item -E STR/INT,FLT[,FLT,[...]]
-@itemx --kernel=STR/INT,FLT[,FLT,[...]]
-Only build one kernel profile with the parameters given as the values to this
option.
-The different values must be separated by a comma (@key{,}).
-The first value identifies the radial function of the profile, either through
a string or through a number (see description of @option{--fcol} in
@ref{MakeProfiles catalog}).
-Each radial profile needs a different total number of parameters: S@'ersic and
Moffat functions need 3 parameters: radial, S@'ersic index or Moffat
@mymath{\beta}, and truncation radius.
-The Gaussian function needs two parameters: radial and truncation radius.
-The point function does not need any parameters and flat and circumference
profiles just need one parameter (truncation radius).
+Please see @ref{Sufi simulates a detection} for a very complete tutorial explaining how one could use MakeProfiles in conjunction with other Gnuastro programs to make a complete simulated image of a mock galaxy.
-The PSF or kernel is a unique (and highly constrained) type of profile: the
sum of its pixels must be one, its center must be the center of the central
pixel (in an image with an odd number of pixels on each side), and commonly it
is circular, so its axis ratio and position angle are one and zero respectively.
-Kernels are commonly necessary for various data analysis and data manipulation steps (for example, see @ref{Convolve} and @ref{NoiseChisel}).
-Because of this it is inconvenient to define a catalog with one row and many
zero valued columns (for all the non-necessary parameters).
-Hence, with this option, it is possible to create a kernel with MakeProfiles
without the need to create a catalog.
-Here are some examples:
+@menu
+* MakeProfiles catalog:: Required catalog properties.
+* MakeProfiles profile settings:: Configuration parameters for all profiles.
+* MakeProfiles output dataset:: The canvas/dataset to build profiles over.
+* MakeProfiles log file:: A description of the optional log file.
+@end menu
-@table @option
-@item --kernel=moffat,3,2.8,5
-A Moffat kernel with FWHM of 3 pixels, @mymath{\beta=2.8} which is truncated
at 5 times the FWHM.
+@node MakeProfiles catalog, MakeProfiles profile settings, Invoking astmkprof,
Invoking astmkprof
+@subsubsection MakeProfiles catalog
+The catalog containing information about each profile can be in the FITS
ASCII, FITS binary, or plain text formats (see @ref{Tables}).
+The latter can also be provided using standard input (see @ref{Standard
input}).
+Its columns can be ordered in any desired manner.
+You can specify which columns belong to which parameters using the set of
options discussed below.
+For example, through the @option{--rcol} and @option{--tcol} options, you can
specify the column that contains the radial parameter for each profile and its
truncation respectively.
+See @ref{Selecting table columns} for a thorough discussion on the values to
these options.
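+For example, here is a minimal sketch with hypothetical column names in @file{catalog.fits}:
+
+@example
+## Select the catalog columns by name (all names here are
+## hypothetical):
+$ astmkprof --ccol=X --ccol=Y --fcol=FUNC --rcol=R_EFF \
+            --mcol=MAG --tcol=TRUNC --mode=img catalog.fits
+@end example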
-@item --kernel=gaussian,2,3
-A circular Gaussian kernel with FWHM of 2 pixels and truncated at 3 times
-the FWHM.
-@end table
+The value for the profile center in the catalog (the @option{--ccol} option)
can be a floating point number so the profile center can be on any sub-pixel
position.
+Note that pixel positions in the FITS standard start from 1 and an integer is
the pixel center.
+So a 2D image actually starts from the position (0.5, 0.5), which is the
bottom-left corner of the first pixel.
+When a @option{--background} image with WCS information is provided, or you
specify the WCS parameters with the respective options@footnote{The options to
set the WCS are the following: @option{--crpix}, @option{--crval}, @option{--cdelt}, @option{--pc}, @option{--cunit} and @option{--ctype}.
+Just recall that these options are only used if @option{--background} is not
given: if the image you give to @option{--background} does not have WCS, these
options will not be used and you cannot use WCS-mode coordinates like RA or
Dec.}, you may also use RA and Dec to identify the center of each profile (see
the @option{--mode} option below).
-This option may also be used to create a 3D kernel.
-To do that, two small modifications are necessary: add a @code{-3d} (or
@code{-3D}) to the profile name (for example, @code{moffat-3d}) and add a
number (axis-ratio along the third dimension) to the end of the parameters for
all profiles except @code{point}.
-The main reason behind providing an axis ratio in the third dimension is that
in 3D astronomical datasets, commonly the third dimension does not have the
same nature (units/sampling) as the first and second.
+In MakeProfiles, profile centers do not have to be in (overlap with) the final
image.
+Even if only one pixel of the profile within the truncation radius overlaps
with the final image size, the profile is built and included in the final image.
+Profiles that are completely out of the image will not be created (unless you
explicitly ask for it with the @option{--individual} option).
+You can use the output log file (created with @option{--log}) to see which profiles were within the image, see @ref{Common options}.
-For example, in IFU (optical) or radio data cubes, the first and second dimensions are commonly spatial/angular positions (like RA and Dec) but the third dimension is wavelength or frequency (in units of Angstroms or Hertz).
-Because of this different nature (which also affects the processing), it may be necessary for the kernel to have a different extent in that direction.
+If PSF profiles (Moffat or Gaussian, see @ref{PSF}) are in the catalog and the
profiles are to be built in one image (when @option{--individual} is not used),
it is assumed they are the PSF(s) you want to convolve your created image with.
+So by default, they will not be built in the output image but as separate
files.
+The sum of pixels of these separate files will also be set to unity (1) so you
are ready to convolve, see @ref{Convolution process}.
+In summary, the position and magnitude of a PSF profile will be ignored.
+This behavior can be disabled with the @option{--psfinimg} option.
+If you want to create all the profiles separately (with @option{--individual})
and you want the sum of the PSF profile pixels to be unity, you have to set
their magnitudes in the catalog to the zero point magnitude and be sure that
the central positions of the profiles do not have any fractional part (the PSF
center has to be in the center of the pixel).
-If the 3rd dimension axis ratio is equal to @mymath{1.0}, then the kernel will
be a spheroid.
-If it is smaller than @mymath{1.0}, the kernel will be button-shaped: extended
less in the third dimension.
-However, when it is larger than @mymath{1.0}, the kernel will be bullet-shaped: extended more in the third dimension.
-In the latter case, the radial parameter will correspond to the length along
the 3rd dimension.
-For example, let's have a look at the two examples above but in 3D:
+The list of options directly related to the input catalog columns is shown
below.
@table @option
-@item --kernel=moffat-3d,3,2.8,5,0.5
-An ellipsoid Moffat kernel with FWHM of 3 pixels, @mymath{\beta=2.8} which is
truncated at 5 times the FWHM.
-The ellipsoid is circular in the first two dimensions, but in the third
dimension its extent is half the first two.
-@item --kernel=gaussian-3d,2,3,1
-A spherical Gaussian kernel with FWHM of 2 pixels and truncated at 3 times
-the FWHM.
-@end table
+@item --ccol=STR/INT
+Center coordinate column for each dimension.
+This option must be called two times to define the center coordinates in an
image.
+For example, @option{--ccol=RA} and @option{--ccol=DEC} (along with
@option{--mode=wcs}) will inform MakeProfiles to look into the catalog columns
named @option{RA} and @option{DEC} for the Right Ascension and Declination of
the profile centers.
-Of course, if a specific kernel is needed that does not fit the constraints
imposed by this option, you can always use a catalog to define any arbitrary
kernel.
-Just call the @option{--individual} and @option{--nomerged} options to make
sure that it is built as a separate file (individually) and no ``merged'' image
of the input profiles is created.
+@item --fcol=INT/STR
+The functional form of the profile with one of the values below depending on
the desired profile.
+The column can contain either the numeric codes (for example, `@code{1}') or
string characters (for example, `@code{sersic}').
+The numeric codes are easier to use in scripts which generate catalogs with
hundreds or thousands of profiles.
-@item -x INT,INT
-@itemx --mergedsize=INT,INT
-The number of pixels along each axis of the output, in FITS order.
-This is before over-sampling.
-For example, if you call MakeProfiles with @option{--mergedsize=100,150
--oversample=5} (assuming no shift due for later convolution), then the final
image size along the first axis will be 500 by 750 pixels.
-Fractions are acceptable as values for each dimension, however, they must
reduce to an integer, so @option{--mergedsize=150/3,300/3} is acceptable but
@option{--mergedsize=150/4,300/4} is not.
+The string format can be easier when the catalog is to be written/checked by
hand/eye before running MakeProfiles.
+It is much more readable and provides a level of documentation.
+All Gnuastro's recognized table formats (see @ref{Recognized table formats})
accept string type columns.
+To have string columns in a plain text table/catalog, see @ref{Gnuastro text
table format}.
-When viewing a FITS image in DS9, the first FITS dimension is in the
horizontal direction and the second is vertical.
-As an example, the image created with the example above will have 500 pixels
horizontally and 750 pixels vertically.
+@itemize
+@item
+S@'ersic profile with `@code{sersic}' or `@code{1}'.
-If a background image is specified, this option is ignored.
+@item
+Moffat profile with `@code{moffat}' or `@code{2}'.
-@item -s INT
-@itemx --oversample=INT
-The scale to over-sample the profiles and final image.
-If it is not an odd number, it will be incremented by one, see @ref{Oversampling}.
-Note that this @option{--oversample} will remain active even if an input image
is specified.
-If your input catalog is based on the background image, be sure to set
@option{--oversample=1}.
+@item
+Gaussian profile with `@code{gaussian}' or `@code{3}'.
-@item --psfinimg
-Build the possibly existing PSF profiles (Moffat or Gaussian) in the catalog
into the final image.
-By default they are built separately so you can convolve your images with
them, thus their magnitude and positions are ignored.
-With this option, they will be built in the final image like every other
galaxy profile.
-To have a final PSF in your image, make a point profile where you want the PSF
and after convolution it will be the PSF.
+@item
+Point source with `@code{point}' or `@code{4}'.
-@item -i
-@itemx --individual
-@cindex Individual profiles
-@cindex Build individual profiles
-If this option is called, each profile is created in a separate FITS file
within the same directory as the output and the row number of the profile
(starting from zero) in the name.
-The file for each row's profile will be in the same directory as the final
combined image of all the profiles and will have the final image's name as a
suffix.
-So for example, if the final combined image is named
@file{./out/fromcatalog.fits}, then the first profile that will be created with
this option will be named @file{./out/0_fromcatalog.fits}.
+@item
+Flat profile with `@code{flat}' or `@code{5}'.
-Since each image only has one full profile (out to its truncation radius), the profile is centered, so only the sub-pixel position of the profile center is important for the outputs of this option.
-The output will have an odd number of pixels.
-If there is no oversampling, the central pixel will contain the profile center.
-If the value to @option{--oversample} is larger than unity, then the profile
center is on any of the central @option{--oversample}'d pixels depending on the
fractional value of the profile center.
+@item
+Circumference profile with `@code{circum}' or `@code{6}'.
+A fixed value will be used for all pixels less than or equal to the truncation
radius (@mymath{r_t}) and greater than @mymath{r_t-w} (@mymath{w} is the value
to the @option{--circumwidth}).
-If the fractional value is larger than half, it is on the bottom half of the
central region.
-This is due to the FITS definition of a real number position: The center of a
pixel has fractional value @mymath{0.00} so each pixel contains these
fractions: .5 -- .75 -- .00 (pixel center) -- .25 -- .5.
+@item
+Radial distance profile with `@code{distance}' or `@code{7}'.
+At the lowest level, each pixel only has an elliptical radial distance given
the profile's shape and orientation (see @ref{Defining an ellipse and
ellipsoid}).
+When this profile is chosen, the pixel's elliptical radial distance from the
profile center is written as its value.
+For this profile, the value in the magnitude column (@option{--mcol}) will be
ignored.
-@item -m
-@itemx --nomerged
-Do Not make a merged image.
-By default after making the profiles, they are added to a final image with
side lengths specified by @option{--mergedsize} if they overlap with it.
+You can use this for checks or as a first approximation to define your own
higher-level radial function.
+In the latter case, just note that the central values are going to be
incorrect (see @ref{Sampling from a function}).
-@end table
+@item
+Custom radial profile with `@code{custom-prof}' or `@code{8}'.
+The values to use for each radial interval should be in the table given to
@option{--customtable}.
+By default, once the profile is built with the given values, it will be scaled
to have a total magnitude that you have requested in the magnitude column of
the profile (in @option{--mcol}).
+If you want the raw values in the 2D profile (to ignore the magnitude column),
use @option{--mcolnocustprof}.
+For more, see the description of @option{--customtable} in @ref{MakeProfiles
profile settings}.
+
+@item
+Azimuthal angle profile with `@code{azimuth}' or `@code{9}'.
+Every pixel within the truncation radius will be given its azimuthal angle (in
degrees, from 0 to 360) from the major axis.
+In combination with the radial distance profile, you can now create complex features in polar coordinates, such as tidal tails or tidal shocks (using the Arithmetic program to mix the radius and azimuthal angle through a function to create your desired features); a sketch is given after this list.
+@item
+Custom image with `@code{custom-img}' or `@code{10}'.
+The image(s) to use should be given to the @option{--customimg} option (which
can be called multiple times for multiple images).
+To identify which one of the images (given to @option{--customimg}) should be
used, you should specify their counter in the ``radius'' column below.
+For more, see the description of @code{custom-img} in @ref{MakeProfiles
profile settings}.
+@end itemize
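+As a sketch of the combination mentioned in the azimuthal angle item above (all names and values here are hypothetical), you can build one image holding the radial distance and another holding the azimuthal angle of the same ellipse, then mix them with Arithmetic (see @ref{Arithmetic}):
+
+@example
+## Distance ('7') and azimuth ('9') images of the same ellipse
+## (column order: id, x, y, fcol, rcol, ncol, pcol, qcol, mcol,
+## tcol; the magnitude column is irrelevant for these profiles):
+$ echo "1 50 50 7 20 0 45 0.7 0 1" \
+       | astmkprof --mergedsize=100,100 --oversample=1 \
+                   --mode=img --output=dist.fits
+$ echo "1 50 50 9 20 0 45 0.7 0 1" \
+       | astmkprof --mergedsize=100,100 --oversample=1 \
+                   --mode=img --output=azim.fits
+
+## Keep the radial distance only on one side of the major axis
+## (azimuth less than 180 degrees):
+$ astarithmetic dist.fits -h1 azim.fits -h1 180 lt x \
+                --output=feature.fits
+@end example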
-@noindent
-The options below can be used to define the world coordinate system (WCS)
properties of the MakeProfiles outputs.
-The option names are deliberately chosen to be the same as the FITS standard
WCS keywords.
-See Section 8 of @url{https://doi.org/10.1051/0004-6361/201015362, Pence et al
[2010]} for a short introduction to WCS in the FITS standard@footnote{The world
coordinate standard in FITS is a very beautiful and powerful concept to
link/associate datasets with the outside world (other datasets).
-The description in the FITS standard (link above) only touches the tip of the iceberg.
-To learn more please see @url{https://doi.org/10.1051/0004-6361:20021326,
Greisen and Calabretta [2002]},
@url{https://doi.org/10.1051/0004-6361:20021327, Calabretta and Greisen
[2002]}, @url{https://doi.org/10.1051/0004-6361:20053818, Greisen et al.
[2006]}, and
@url{http://www.atnf.csiro.au/people/mcalabre/WCS/dcs_20040422.pdf, Calabretta
et al.}}.
+@item --rcol=STR/INT
+The radius parameter of the profiles.
+Effective radius (@mymath{r_e}) if S@'ersic, FWHM if Moffat or Gaussian.
-If you look into the headers of a FITS image with WCS for example, you will
see all these names but in uppercase and with numbers to represent the
dimensions, for example, @code{CRPIX1} and @code{PC2_1}.
-You can see the FITS headers with Gnuastro's @ref{Fits} program using a
command like this: @command{$ astfits -p image.fits}.
+For a custom image profile, this option is not interpreted as a radius, but as
a counter (identifying which one of the images given to @option{--customimg}
should be used for each row).
-If the values given to any of these options does not correspond to the number
of dimensions in the output dataset, then no WCS information will be added.
-Also recall that if you use the @option{--background} option, all of these
options are ignored.
-Such that if the image given to @option{--background} does not have any WCS,
the output of MakeProfiles will also not have any WCS, even if these options
are given@footnote{If you want to add profiles @emph{and} WCS over the
background image (to produce your output), you need more than one command:
-1. You should use @option{--mergedsize} in MakeProfiles to manually set the
output number of pixels equal to your desired background image (so the
background is zero).
-In this mode, you can use these WCS-related options to define the WCS.
-2. Then use Arithmetic to add the pixels of your mock image to the background
(see @ref{Arithmetic}.}.
+@item --ncol=STR/INT
+The S@'ersic index (@mymath{n}) or Moffat @mymath{\beta}.
-@table @option
+@item --pcol=STR/INT
+The position angle (in degrees) of the profiles relative to the first FITS
axis (horizontal when viewed in SAO DS9).
+When building a 3D profile, this is the first Euler angle: first rotation of
the ellipsoid major axis from the first FITS axis (rotating about the third
axis).
+See @ref{Defining an ellipse and ellipsoid}.
-@item --crpix=FLT,FLT
-The pixel coordinates of the WCS reference point.
-Fractions are acceptable for the values of this option.
+@item --p2col=STR/INT
+Second Euler angle (in degrees) when building a 3D ellipsoid.
+This is the second rotation of the ellipsoid major axis (following
@option{--pcol}) about the (rotated) X axis.
+See @ref{Defining an ellipse and ellipsoid}.
+This column is ignored when building a 2D profile.
-@item --crval=FLT,FLT
-The WCS coordinates of the Reference point.
-Fractions are acceptable for the values of this option.
-The comma-separated values can either be in degrees (a single number), or
sexagesimal (@code{_h_m_} for RA, @code{_d_m_} for Dec, or @code{_:_:_} for
both).
-In any case, the final value that will be written in the @code{CRVAL} keyword
will be a floating point number in degrees (according to the FITS standard).
+@item --p3col=STR/INT
+Third Euler angle (in degrees) when building a 3D ellipsoid.
+This is the third rotation of the ellipsoid major axis (following
@option{--pcol} and @option{--p2col}) about the (rotated) Z axis.
+See @ref{Defining an ellipse and ellipsoid}.
+This column is ignored when building a 2D profile.
-@item --cdelt=FLT,FLT
-The resolution (size of one data-unit or pixel in WCS units) of the
non-oversampled dataset.
-Fractions are acceptable for the values of this option.
+@item --qcol=STR/INT
+The axis ratio of the profiles (minor axis divided by the major axis in a 2D
ellipse).
+When building a 3D ellipse, this is the ratio of the major axis to the
semi-axis length of the second dimension (in a right-handed coordinate system).
+See @mymath{q1} in @ref{Defining an ellipse and ellipsoid}.
-@item --pc=FLT,FLT,FLT,FLT
-The PC matrix of the WCS rotation, see the FITS standard (link above) to
better understand the PC matrix.
+@item --q2col=STR/INT
+The ratio of the ellipsoid major axis to the third semi-axis length (in a
right-handed coordinate system) of a 3D ellipsoid.
+See @mymath{q1} in @ref{Defining an ellipse and ellipsoid}.
+This column is ignored when building a 2D profile.
-@item --cunit=STR,STR
-The units of each WCS axis, for example, @code{deg}.
-Note that these values are part of the FITS standard (link above).
-MakeProfiles will not complain if you use non-standard values, but later usage
of them might cause trouble.
+@item --mcol=STR/INT
+The total pixelated magnitude of the profile within the truncation radius, see
@ref{Profile magnitude}.
-@item --ctype=STR,STR
-The type of each WCS axis, for example, @code{RA---TAN} and @code{DEC--TAN}.
-Note that these values are part of the FITS standard (link above).
-MakeProfiles will not complain if you use non-standard values, but later usage
of them might cause trouble.
+@item --tcol=STR/INT
+The truncation radius of this profile.
+By default it is in units of the radial parameter of the profile (the value in
the @option{--rcol} of the catalog).
+If @option{--tunitinp} is given, this value is interpreted in units of pixels
(prior to oversampling) irrespective of the profile.
@end table
-@node MakeProfiles log file, , MakeProfiles output dataset, Invoking astmkprof
-@subsubsection MakeProfiles log file
+@node MakeProfiles profile settings, MakeProfiles output dataset, MakeProfiles
catalog, Invoking astmkprof
+@subsubsection MakeProfiles profile settings
-Besides the final merged dataset of all the profiles, or the individual
datasets (see @ref{MakeProfiles output dataset}), if the @option{--log} option
is called MakeProfiles will also create a log file in the current directory
(where you run MockProfiles).
-See @ref{Common options} for a full description of @option{--log} and other
options that are shared between all Gnuastro programs.
-The values for each column are explained in the first few commented lines of
the log file (starting with @command{#} character).
-Here is a more complete description.
+The profile parameters that differ between each created profile are specified
through the columns in the input catalog and described in @ref{MakeProfiles
catalog}.
+Besides those, there are general settings for some profiles that do not differ between one profile and another: they are a property of the general process.
+For example, the number of random points to use in the Monte Carlo integration is fixed for all the profiles.
+The options described in this section are for configuring such properties.
-@itemize
-@item
-An ID (row number of profile in input catalog).
+@table @option
-@item
-The total magnitude of the profile in the output dataset.
-When the profile does not completely overlap with the output dataset, this
will be different from your input magnitude.
+@item --mode=STR
+Interpret the center position columns (@option{--ccol} in @ref{MakeProfiles
catalog}) in image or WCS coordinates.
+This option thus accepts only two values: @option{img} and @option{wcs}.
+It is mandatory when a catalog is being used as input.
-@item
-The number of pixels (in the oversampled image) which used Monte Carlo
integration and not the central pixel value, see @ref{Sampling from a function}.
+@item -r INT
+@itemx --numrandom=INT
+The number of random points used in the central regions of the profile, see
@ref{Sampling from a function}.
-@item
-The fraction of flux in the Monte Carlo integrated pixels.
+@item -e
+@itemx --envseed
+@cindex Seed, Random number generator
+@cindex Random number generator, Seed
+Use the value of the @code{GSL_RNG_SEED} environment variable to generate the
random Monte Carlo sampling distribution, see @ref{Sampling from a function}
and @ref{Generating random numbers}.
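+
+For example, here is a sketch of a reproducible run (the seed value and catalog name are arbitrary, and the other options of your particular run are omitted):
+
+@example
+$ GSL_RNG_SEED=345 astmkprof catalog.txt --envseed
+@end example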
-@item
-If an individual image was created, this column will have a value of @code{1},
otherwise it will have a value of @code{0}.
-@end itemize
+@item -t FLT
+@itemx --tolerance=FLT
+The tolerance to switch from Monte Carlo integration to the central pixel
value, see @ref{Sampling from a function}.
+
+@item -p
+@itemx --tunitinp
+The truncation column of the catalog is in units of pixels.
+By default, the truncation column is considered to be in units of the radial
parameters of the profile (@option{--rcol}).
+Read it as `t-unit-in-p' for `truncation unit in pixels'.
+@item -f
+@itemx --mforflatpix
+When making fixed value profiles (``flat'', ``circumference'' or ``point''
profiles, see `@option{--fcol}'), do not use the value in the column specified
by `@option{--mcol}' as the magnitude.
+Instead use it as the exact value that all the pixels of these profiles should
have.
+This option is irrelevant for other types of profiles.
+This option is very useful for creating masks, or labeled regions in an image.
+Any integer or floating point value can be used in this column with this option,
including @code{NaN} (or `@code{nan}', or `@code{NAN}', case is irrelevant),
and infinities (@code{inf}, @code{-inf}, or @code{+inf}).
+For example, with this option if you set the value in the magnitude column
(@option{--mcol}) to @code{NaN}, you can create an elliptical or circular mask
over an image (which can be given as the argument), see @ref{Blank pixels}.
+Another useful application of this option is to create labeled elliptical or
circular apertures in an image.
+To do this, set the value in the magnitude column to the label you want for
this profile.
+This labeled image can then be used in combination with NoiseChisel's output
(see @ref{NoiseChisel output}) to do aperture photometry with MakeCatalog (see
@ref{MakeCatalog}).
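+
+For example, the command below is a minimal sketch of masking an elliptical region of a hypothetical @file{image.fits} with blank pixels: the profile function code @code{5} is a ``flat'' profile and the magnitude column is set to @code{nan} (the coordinates and shape parameters are arbitrary).
+
+@example
+$ echo "1 150 220 5 20 0 45 0.7 nan 1" \
+      | astmkprof --background=image.fits --mode=img \
+                  --mforflatpix --oversample=1 \
+                  --output=masked.fits
+@end example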
+Alternatively, if you want to mark regions of the image (for example, with an
elliptical circumference) and you do not want to use NaN values (as explained
above) for some technical reason, you can get the minimum or maximum value in
the image @footnote{
+The minimum will give a better result, because the maximum can be too high
compared to most pixels in the image, making it harder to display.}
+using Arithmetic (see @ref{Arithmetic}), then use that value in the magnitude
column along with this option for all the profiles.
+Please note that when using MakeProfiles on an already existing image, you
have to set `@option{--oversample=1}'.
+Otherwise all the profiles will be scaled up based on the oversampling scale
in your configuration files (see @ref{Configuration files}) unless you have
accounted for oversampling in your catalog.
+@item --mcolissum
+The value given in the ``magnitude'' column (specified by @option{--mcol}, see
@ref{MakeProfiles catalog}) must be interpreted as total sum of pixel values,
not magnitude (which is measured from the total sum and zero point, see
@ref{Brightness flux magnitude}).
+When this option is called, the zero point magnitude (value to the
@option{--zeropoint} option) is ignored and the given value must have the same
units as the input dataset's pixels.
+Recall that the total profile magnitude that is specified in the
@option{--mcol} column of the input catalog is not an integration to infinity,
but the actual sum of pixels in the profile (until the desired truncation
radius).
+See @ref{Profile magnitude} for more on this point.
+@item --mcolnocustprof
+Do not touch (re-scale) the custom profile that should be inserted as a
@code{custom-prof} profile (see the description of @option{--fcol} in
@ref{MakeProfiles catalog} or the description of @option{--customtable} below).
+By default, MakeProfiles will scale (multiply) the custom profile's values to
have the desired magnitude (or sum of pixels if @option{--mcolissum} is called)
in that row.
+@item --mcolnocustimg
+Do not touch (re-scale) the custom image that should be inserted as a
@code{custom-img} profile (see the description of @option{--fcol} in
@ref{MakeProfiles catalog}).
+By default, MakeProfiles will scale (multiply) the custom image's pixels to
have the desired magnitude (or sum of pixels if @option{--mcolissum} is called)
in that row.
+@item --magatpeak
+The magnitude column in the catalog (see @ref{MakeProfiles catalog}) will be
used to set the value only for the profile's peak (maximum) pixel, not the full
profile.
+Note that this is the flux of the profile's peak (maximum) pixel in the final
output of MakeProfiles.
+So beware of the oversampling, see @ref{Oversampling}.
+This option can be useful if you want to check a mock profile's total
magnitude at various truncation radii.
+Without this option, no matter what the truncation radius is, the total
magnitude will be the same as that given in the catalog.
+But with this option, the total magnitude will become brighter as you increase
the truncation radius.
+In sharper profiles, the peak flux can sometimes be measured more accurately than the overall sum or magnitude of the object.
+In such cases, with this option, the final profile will be built such that its
peak has the given magnitude, not the total profile.
-@node MakeNoise, , MakeProfiles, Data modeling
-@section MakeNoise
+@cartouche
+@strong{CAUTION:} If you want to use this option for comparing with
observations, please note that MakeProfiles does not do convolution.
+Unless you have de-convolved your data, your images are convolved with the
instrument and atmospheric PSF, see @ref{PSF}.
+Particularly in sharper profiles, the flux in the peak pixel is strongly
decreased after convolution.
+Also note that in such cases, besides de-convolution, you will have to set
@option{--oversample=1} otherwise after resampling your profile with Warp (see
@ref{Warp}), the peak flux will be different.
+@end cartouche
-@cindex Noise
-Real data are always buried in noise, therefore to finalize a simulation of
real data (for example, to test our observational algorithms) it is essential
to add noise to the mock profiles created with MakeProfiles, see
@ref{MakeProfiles}.
-Below, the general principles and concepts to help understand how noise is
quantified is discussed.
-MakeNoise options and argument are then discussed in @ref{Invoking astmknoise}.
+@item --customtable FITS/TXT
+The filename of the table to use for the custom radial profiles (see the description of @option{--fcol} in @ref{MakeProfiles catalog}).
+This can be a plain-text or FITS table, see @ref{Recognized table formats}.
+If it is a FITS table, you can use @option{--customtablehdu} to specify which HDU should be used (described below).
-@menu
-* Noise basics:: Noise concepts and definitions.
-* Invoking astmknoise:: Options and arguments to MakeNoise.
-@end menu
+A custom radial profile can have any value you want at a given radius (including NaN/blank values).
+Each interval is defined by its minimum (inclusive) and maximum (exclusive) radius: when a pixel center falls within an interval, the value specified for that interval will be used.
+If a pixel is not in the given intervals, a value of 0.0 will be used for that
pixel.
+The table should have 3 columns as shown below.
+If the intervals are contiguous (the maximum value of each interval is equal to the minimum value of the next) and the intervals all have the same size (difference between minimum and maximum values), the creation of these profiles will be fast.
+However, if the intervals are not sorted and contiguous, MakeProfiles will
parse the intervals from the top of the table and use the first interval that
contains the pixel center (this may slow it down).
+@table @asis
+@item Column 1:
+The interval's minimum radius.
+@item Column 2:
+The interval's maximum radius.
+@item Column 3:
+The value to be used for pixels within the given interval (including
NaN/blank).
+@end table
-@node Noise basics, Invoking astmknoise, MakeNoise, MakeNoise
-@subsection Noise basics
+Gnuastro's column arithmetic in the Table program has the
@code{sorted-to-interval} operator that will generate the first two columns
from a single column (your radial profile).
+See the description of that operator in @ref{Column arithmetic} and the
example below.
-@cindex Noise
-@cindex Image noise
-Deep astronomical images, like those used in extragalactic studies, seriously
suffer from noise in the data.
-Generally speaking, the sources of noise in an astronomical image are photon
counting noise and Instrumental noise which are discussed in @ref{Photon
counting noise} and @ref{Instrumental noise}.
-This review finishes with @ref{Generating random numbers} which is a short
introduction on how random numbers are generated.
-We will see that while software random number generators are not perfect, they
allow us to obtain a reproducible series of random numbers through setting the
random number generator function and seed value.
-Therefore in this section, we will also discuss how you can set these two
parameters in Gnuastro's programs (including MakeNoise).
+By default, once a 2D image is constructed for the radial profile, it will be
scaled such that its total magnitude corresponds to the value in the magnitude
column (@option{--mcol}) of the main input catalog.
+If you want to disable the scaling and use the raw values in your custom
profile (in other words: you want to ignore the magnitude column) you need to
call @option{--mcolnocustprof} (see above).
-@menu
-* Photon counting noise:: Poisson noise
-* Instrumental noise:: Readout, dark current and other sources.
-* Final noised pixel value:: How the final noised value is calculated.
-* Generating random numbers:: How random numbers are generated.
-@end menu
+In the example below, we'll start with a certain radial profile, and use this
option to build its 2D representation in an image (recall that you can build
radial profiles with @ref{Generate radial profile}).
+But first, we will need to use the @code{sorted-to-interval} operator to build the necessary input format (see @ref{Column arithmetic}).
-@node Photon counting noise, Instrumental noise, Noise basics, Noise basics
-@subsubsection Photon counting noise
+@example
+$ cat radial.txt
+# Column 1: RADIUS [pix ,f32,] Radial distance
+# Column 2: MEAN [input-units,f32,] Mean of values.
+0.0 1.00000
+1.0 0.50184
+1.4 0.37121
+2.0 0.26414
+2.2 0.23427
+2.8 0.17868
+3.0 0.16627
+3.1 0.15567
+3.6 0.13132
+4.0 0.11404
-@cindex Counting error
-@cindex de Moivre, Abraham
-@cindex Poisson distribution
-@cindex Photon counting noise
-@cindex Poisson, Sim@'eon Denis
-With the very accurate electronics used in today's detectors, photon counting
noise@footnote{In practice, we are actually counting the electrons that are
produced by each photon, not the actual photons.} is the most significant
source of uncertainty in most datasets.
-To understand this noise (error in counting) and its effect on the images of
astronomical targets, let's start by reviewing how a distribution produced by
counting can be modeled as a parametric function.
+## Convert the radius in each row to an interval
+$ asttable radial.txt --output=interval.fits \
+ -c'arith RADIUS sorted-to-interval',MEAN
-Counting is an inherently discrete operation, which can only produce positive
integer outputs (including zero).
-For example, we cannot count @mymath{3.2} or @mymath{-2} of anything.
-We only count @mymath{0}, @mymath{1}, @mymath{2}, @mymath{3} and so on.
-The distribution of values, as a result of counting efforts is formally known
as the @url{https://en.wikipedia.org/wiki/Poisson_distribution, Poisson
distribution}.
-It is associated to Sim@'eon Denis Poisson, because he discussed it while
working on the number of wrongful convictions in court cases in his 1837
book@footnote{[From Wikipedia] Poisson's result was also derived in a previous
study by Abraham de Moivre in 1711.
-Therefore some people suggest it should rightly be called the de Moivre
distribution.}.
+## Inspect the table containing intervals
+$ asttable interval.fits -ffixed
+-0.500000 0.500000 1.000000
+0.500000 1.200000 0.501840
+1.200000 1.700000 0.371210
+1.700000 2.100000 0.264140
+2.100000 2.500000 0.234270
+2.500000 2.900000 0.178680
+2.900000 3.050000 0.166270
+3.050000 3.350000 0.155670
+3.350000 3.800000 0.131320
+3.800000 4.200000 0.114040
-@cindex Probability density function
-Let's take @mymath{\lambda} to represent the expected mean count of something.
-Furthermore, let's take @mymath{k} to represent the output of a counting
attempt (hence @mymath{k} is a positive integer).
-The probability density function of getting @mymath{k} counts (in each
attempt, given the expected/mean count of @mymath{\lambda}) can be written as:
+## Build the 2D image of the profile from the interval.
+$ echo "1 7 7 8 10 2.5 0 1 1 2" \
+ | astmkprof --mergedsize=13,13 --oversample=1 \
+ --customtable=interval.fits \
+ --output=image.fits
-@cindex Poisson distribution
-@dispmath{f(k)={\lambda^k \over k!} e^{-\lambda},\quad k\in @{0, 1, 2, 3,
\dots @}}
+## View the created FITS image.
+$ astscript-fits-view image.fits --ds9scale=minmax
+@end example
-@cindex Skewed Poisson distribution
-Because the Poisson distribution is only applicable to positive integer values
(note the factorial operator, which only applies to non-negative integers),
naturally it is very skewed when @mymath{\lambda} is near zero.
-One qualitative way to understand this behavior is that for smaller values
near zero, there simply are not enough integers smaller than the mean, than
integers that are larger.
-Therefore to accommodate all possibilities/counts, it has to be strongly
skewed to the positive when the mean is small.
-For more on Skewness, see @ref{Skewness caused by signal and its measurement}.
+Recall that if you want your image pixels to have the same values as the
@code{MEAN} column in your profile, you should run MakeProfiles with
@option{--mcolnocustprof}.
-@cindex Compare Poisson and Gaussian
-As @mymath{\lambda} becomes larger, the distribution becomes more and more
symmetric, and the variance of that distribution is equal to its mean.
-In other words, the standard deviation is the square root of the mean.
-It can also be proved that when the mean is large, say @mymath{\lambda>1000},
the Poisson distribution approaches the
@url{https://en.wikipedia.org/wiki/Normal_distribution, Normal (Gaussian)
distribution} with mean @mymath{\mu=\lambda} and standard deviation
@mymath{\sigma=\sqrt{\lambda}}.
-In other words, a Poisson distribution (with a sufficiently large
@mymath{\lambda}) is simply a Gaussian that has one free parameter
(@mymath{\mu=\lambda} and @mymath{\sigma=\sqrt{\lambda}}), instead of the two
parameters that the Gaussian distribution originally has (independent
@mymath{\mu} and @mymath{\sigma}).
+@item --customtablehdu INT/STR
+The HDU/extension in the FITS file given to @option{--customtable}.
-@cindex Sky value
-@cindex Background flux
-@cindex Undetected objects
-In real situations, the photons/flux from our targets are combined with
photons from a certain background (observationally, the @emph{Sky} value).
-The Sky value is defined to be the average flux of a region in the dataset
with no targets.
-Its physical origin can be the brightness of the atmosphere (for ground-based
instruments), possible stray light within the imaging instrument, the average
flux of undetected targets, etc.
-The Sky value is thus an ideal definition, because in real datasets, what lies
deep in the noise (far lower than the detection limit) is never
known@footnote{In a real image, a relatively large number of very faint objects
can be fully buried in the noise and never detected.
-These undetected objects will bias the background measurement to slightly
larger values.
-Our best approximation is thus to simply assume they are uniform, and consider
their average effect.
-See Figure 1 (a.1 and a.2) and Section 2.2 in
@url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}.}.
-To account for all of these, the sky value is defined to be the average
count/value of the undetected regions in the image.
-In a mock image/dataset, we have the luxury of setting the background (Sky)
value.
+@item --customimg=STR[,STR]
+A custom FITS image that should be used for the @code{custom-img} profiles
(see the description of @option{--fcol} in @ref{MakeProfiles catalog}).
+Multiple files can be given to this option (separated by a comma), and this
option can be called multiple times itself (useful when many custom image
profiles should be added).
+If the HDUs of the images are different, you can use @option{--customimghdu} (described below).
-@cindex Simulating noise
-@cindex Noise simulation
-In summary, the value in each element of the dataset (pixel in an image) is
the sum of contributions from various galaxies and stars (after convolution by
the PSF, see @ref{PSF}).
-Let's name the convolved sum of possibly overlapping objects in each pixel as
@mymath{I_{nn}}.
-@mymath{nn} represents `no noise'.
-For now, let's assume the background (@mymath{B}) is constant and sufficiently
high for the Poisson distribution to be approximated by a Gaussian.
-Then the flux of that pixel, after adding noise, is @emph{a random value}
taken from a Gaussian distribution with the following mean (@mymath{\mu}) and
standard deviation (@mymath{\sigma}):
+Through the ``radius'' column, MakeProfiles will know which one of the images
given to this option should be used in each row.
+For example, let's assume your input catalog (@file{cat.fits}) has the
following contents (output of first command below), and you call MakeProfiles
like the second command below to insert four profiles into the background
@file{back.fits} image.
-@dispmath{\mu=B+I_{nn}, \quad \sigma=\sqrt{B+I_{nn}}}
+The first profile below is S@'ersic (with an @option{--fcol}, or 4th column, code of @code{1}).
+So MakeProfiles builds the pixels of the first profile, and all column values
are meaningful.
+However, the second, third and fourth inserted objects are custom images (with
an @option{--fcol} code of @code{10}).
+For the custom image profiles, you see that the radius column has values of
@code{1} or @code{2}.
+This tells MakeProfiles to use the first image given to @option{--customimg}
(or @file{gal-1.fits}) for the second and fourth inserted objects.
+The second image given to @option{--customimg} (or @file{gal-2.fits}) will be used for the third inserted object.
+Finally, all three custom image profiles have different magnitudes, and the
values in @option{--ncol}, @option{--pcol}, @option{--qcol} and @option{--tcol}
are ignored.
-@cindex Bias level in detectors
-@cindex Dark level in detectors
-In astronomical instruments, @mymath{B} is enhanced by adding a ``bias'' level
to each pixel before the shutter is even opened (for the exposure to start).
-As the exposure is ongoing and photo-electrons are accumulating from the
astronomical objects, a ``dark'' current (due to thermal radiation of the
instrument) also builds up in the pixels.
-The ``dark'' current will accumulate even when the shutter is closed, but the
CCD electronics are working (hence the name ``dark'').
-This added dark level further enhances the mean value in a real observation
compared to the raw background value (from the atmosphere for example).
+@example
+$ asttable cat.fits
+1 53.15506 -27.785165 1 20 1 20 0.6 25 5
+2 53.15602 -27.777887 10 1 0 0 0 22 0
+3 53.16440 -27.775876 10 2 0 0 0 24 0
+4 53.16849 -27.787406 10 1 0 0 0 23 0
-Since this type of noise is inherent in the objects we study, it is usually
measured on the same scale as the astronomical objects, namely the magnitude
system, see @ref{Brightness flux magnitude}.
-It is then internally converted to the flux scale for further processing.
+$ astmkprof cat.fits --mode=wcs --zeropoint=25.68 \
+ --background=back.fits --output=out.fits \
+ --customimg=gal-1.fits --customimg=gal-2.fits
+@end example
-The equations above clearly show the importance of the background value and
its effect on the final signal to noise ratio in each pixel of a science image.
-It is therefore, one of the most important factors in understanding the noise
(and properly simulating observations where necessary).
-An inappropriately bright background value can hide the signal of the mock
profile hide behind the noise.
-In other words, a brighter background has larger standard deviation and vice
versa.
-As a result, the only necessary parameter to define photon-counting noise over
a mock image of simulated profiles is the background.
-For a complete example, see @ref{Sufi simulates a detection}.
+@item --customimghdu=INT/STR
+The HDU(s) of the images given to @option{--customimg}.
+If this option is only called once, but @option{--customimg} is called many
times, MakeProfiles will assume that all images given to @option{--customimg}
have the same HDU.
+Otherwise (if the number of HDUs is equal to the number of images), then each
image will use its corresponding HDU.
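+
+For example, extending the command above, if (hypothetically) @file{gal-1.fits} keeps its image in HDU @code{1} and @file{gal-2.fits} in HDU @code{2}, a sketch of the call would be:
+
+@example
+$ astmkprof cat.fits --mode=wcs --zeropoint=25.68 \
+            --background=back.fits --output=out.fits \
+            --customimg=gal-1.fits --customimg=gal-2.fits \
+            --customimghdu=1 --customimghdu=2
+@end example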
-To better understand the correlation between the mean (or background) value
and the noise standard deviation, let's use an analogy.
-Consider the profile of your galaxy to be analogous to the profile of a ship
that is sailing in the sea.
-The height of the ship would therefore be analogous to the maximum flux
difference between your galaxy's minimum and maximum values.
-Furthermore, let's take the depth of the sea to represent the background
value: a deeper sea, corresponds to a brighter background.
-In this analogy, the ``noise'' would be the height of the waves that surround
the ship: in deeper waters, the waves would also be taller (the square root of
the mean depth at the ship's position).
+@item -X INT,INT
+@itemx --shift=INT,INT
+Shift all the profiles and enlarge the image along each dimension.
+To better understand this option, please see @mymath{n} in @ref{If convolving
afterwards}.
+This is useful when you want to convolve the image afterwards.
+If you are using an external PSF, be sure to oversample it to the same scale
used for creating the mock images.
+If a background image is specified, any possible value to this option is
ignored.
-If the ship is in deep waters, the height of waves are greater than when the
ship is near to the beach (at lower depths).
-Therefore, when the ship is in the middle of the sea, there are high waves
that are capable of hiding a significant part of the ship from our perspective.
-This corresponds to a brighter background value in astronomical images: the
resulting noise from that brighter background can completely wash out the
signal from a fainter galaxy, star or solar system object.
+@item -c
+@itemx --prepforconv
+Shift all the profiles and enlarge the image based on half the width of the first Moffat or Gaussian profile in the catalog, considering any possible oversampling; see @ref{If convolving afterwards}.
+@option{--prepforconv} is only checked and possibly activated if both values given to @option{--shift} are zero (after reading the command-line and configuration files).
+If a background image is specified, any possible value to this option is
ignored.
-@node Instrumental noise, Final noised pixel value, Photon counting noise,
Noise basics
-@subsubsection Instrumental noise
+@item -z FLT
+@itemx --zeropoint=FLT
+The zero point magnitude of the input.
+For more on the zero point magnitude, see @ref{Brightness flux magnitude}.
-@cindex Readout noise
-@cindex Instrumental noise
-@cindex Noise, instrumental
-While taking images with a camera, a bias current is fed to the pixels, the
variation of the value of this bias current over the pixels, also adds to the
final image noise.
-Another source of noise is the readout noise that is produced by the
electronics in the detector.
-Specifically, the parts that attempt to digitize the voltage produced by the
photo-electrons in the analog to digital converter.
-With the current generation of instruments, this source of noise is not as
significant as the noise due to the background Sky discussed in @ref{Photon
counting noise}.
+@item -w FLT
+@itemx --circumwidth=FLT
+The width of the circumference if the profile is to be an elliptical
circumference or annulus.
+See the explanations for this type of profile in @option{--fcol}.
-Let @mymath{C} represent the combined standard deviation of all these
instrumental sources of noise.
-When only this source of noise is present, the noised pixel value would be a
random value chosen from a Gaussian distribution with
+@item -R
+@itemx --replace
+Do not add the pixels of each profile over the background or other profiles; replace the values instead.
-@dispmath{\mu=I_{nn}, \quad \sigma=\sqrt{C^2+I_{nn}}}
+By default, when two profiles overlap, the final pixel value is the sum of all
the profiles that overlap on that pixel.
+This is the expected situation when dealing with physical object profiles like
galaxies or stars/PSF.
+However, when MakeProfiles is used to build integer labeled images (for
example, in @ref{Aperture photometry}), this is not the expected situation: the
sum of two labels will be a new label.
+With this option, the pixels are not added but the largest (maximum) value
over that pixel is used.
+Because the maximum operator is independent of the order of values, the output
is also thread-safe.
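+
+For example, the minimal sketch below (all values are arbitrary) builds two overlapping ``flat'' profiles whose pixel values are the labels @code{1} and @code{2} (through @option{--mforflatpix}); with @option{--replace}, pixels in the overlap keep the larger label instead of becoming a new label of @mymath{1+2=3}:
+
+@example
+$ printf "1 40 50 5 15 0 0 1 1 1\n2 60 50 5 15 0 0 1 2 1\n" \
+      | astmkprof --mode=img --mergedsize=100,100 \
+                  --oversample=1 --type=int32 \
+                  --mforflatpix --replace --output=labels.fits
+@end example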
-@cindex ADU
-@cindex Gain
-@cindex Counts
-This type of noise is independent of the signal in the dataset, it is only
determined by the instrument.
-So the flux scale (and not magnitude scale) is most commonly used for this
type of noise.
-In practice, this value is usually reported in analog-to-digital units or
ADUs, not flux or electron counts.
-The gain value of the device can be used to convert between these two, see
@ref{Brightness flux magnitude}.
+@end table
-@node Final noised pixel value, Generating random numbers, Instrumental noise,
Noise basics
-@subsubsection Final noised pixel value
-Based on the discussions in @ref{Photon counting noise} and @ref{Instrumental
noise}, depending on the values you specify for @mymath{B} and @mymath{C} from
the above, the final noised value for each pixel is a random value chosen from
a Gaussian distribution with
+@node MakeProfiles output dataset, MakeProfiles log file, MakeProfiles profile
settings, Invoking astmkprof
+@subsubsection MakeProfiles output dataset
+MakeProfiles takes an input catalog and uses the basic properties that are defined there to build a dataset, for example, a 2D image containing the profiles in the catalog.
+In @ref{MakeProfiles catalog} and @ref{MakeProfiles profile settings}, the
catalog and profile settings were discussed.
+The options of this section allow you to configure the output dataset (or the canvas that will host the built profiles).
-@dispmath{\mu=B+I_{nn}, \quad \sigma=\sqrt{C^2+B+I_{nn}}}
+@table @option
+@item -k FITS
+@itemx --background=FITS
+A background image FITS file to build the profiles on.
+The extension that contains the image should be specified with the
@option{--backhdu} option, see below.
+When a background image is specified, it will be used to derive all the
information about the output image.
+Hence, the following options will be ignored: @option{--mergedsize},
@option{--oversample}, @option{--crpix}, @option{--crval} (generally, all other
WCS related parameters) and the output's data type (see @option{--type} in
@ref{Input output options}).
+The background image will act like a canvas to build the profiles on: profile
pixel values will be summed with the background image pixel values.
+With the @option{--replace} option you can disable this behavior and replace the background pixels with the profile pixels.
+If you want to use all the image information above, except for the pixel
values (you want to have a blank canvas to build the profiles on, based on an
input image), you can call @option{--clearcanvas}, to set all the input image's
pixels to zero before starting to build the profiles over it (this is done in
memory after reading the input, so nothing will happen to your input file).
-@node Generating random numbers, , Final noised pixel value, Noise basics
-@subsubsection Generating random numbers
+@item -B STR/INT
+@itemx --backhdu=STR/INT
+The header data unit (HDU) of the file given to @option{--background}.
-@cindex Random numbers
-@cindex Numbers, random
-As discussed above, to generate noise we need to make random samples of a
particular distribution.
-So it is important to understand some general concepts regarding the
generation of random numbers.
-For a very complete and nice introduction we strongly advise reading Donald
Knuth's ``The art of computer programming'', volume 2, chapter
3@footnote{Knuth, Donald. 1998.
-The art of computer programming. Addison--Wesley. ISBN 0-201-89684-2 }.
-Quoting from the GNU Scientific Library manual, ``If you do not own it, you
should stop reading right now, run to the nearest bookstore, and buy
it''@footnote{For students, running to the library might be more affordable!}!
+@item -C
+@itemx --clearcanvas
+When an input image is specified (with the @option{--background} option), all its pixels are set to 0.0 immediately after it is read into memory.
+Effectively, this will allow you to use all its properties (described under
the @option{--background} option), without having to worry about the pixel
values.
-@cindex Psuedo-random numbers
-@cindex Numbers, psuedo-random
-Using only software, we can only produce what is called a psuedo-random
sequence of numbers.
-A true random number generator is a hardware (let's assume we have made sure
it has no systematic biases), for example, throwing dice or flipping coins
(which have remained from the ancient times).
-More modern hardware methods use atmospheric noise, thermal noise or other
types of external electromagnetic or quantum phenomena.
-All pseudo-random number generators (software) require a seed to be the basis
of the generation.
-The advantage of having a seed is that if you specify the same seed for
multiple runs, you will get an identical sequence of random numbers which
allows you to reproduce the same final noised image.
+@option{--clearcanvas} can come in handy in many situations, for example, if
you want to create a labeled image (segmentation map) for creating a catalog
(see @ref{MakeCatalog}).
+In other cases, you might have modeled the objects in an image and want to
create them on the same frame, but without the original pixel values.
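+
+For example, here is a minimal sketch (with hypothetical file names) that uses @file{image.fits} only as a canvas (its size, WCS and data type), building the catalog's profiles over zero-valued pixels:
+
+@example
+$ astmkprof cat.txt --mode=img --background=image.fits \
+            --clearcanvas --zeropoint=25 --oversample=1 \
+            --output=mock.fits
+@end example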
-@cindex Environment variables
-@cindex GNU Scientific Library
-The programs in GNU Astronomy Utilities (for example, MakeNoise or
MakeProfiles) use the GNU Scientific Library (GSL) to generate random numbers.
-GSL allows the user to set the random number generator through environment
variables, see @ref{Installation directory} for an introduction to environment
variables.
-In the chapter titled ``Random Number Generation'' they have fully explained
the various random number generators that are available (there are a lot of
them!).
-Through the two environment variables @code{GSL_RNG_TYPE} and
@code{GSL_RNG_SEED} you can specify the generator and its seed respectively.
+@item -E STR/INT,FLT[,FLT,[...]]
+@itemx --kernel=STR/INT,FLT[,FLT,[...]]
+Only build one kernel profile with the parameters given as the values to this
option.
+The different values must be separated by a comma (@key{,}).
+The first value identifies the radial function of the profile, either through
a string or through a number (see description of @option{--fcol} in
@ref{MakeProfiles catalog}).
+Each radial profile needs a different total number of parameters.
+S@'ersic and Moffat functions need 3 parameters: the radial parameter, the S@'ersic index or Moffat @mymath{\beta}, and the truncation radius.
+The Gaussian function needs two parameters: the radial parameter and the truncation radius.
+The point function does not need any parameters, and the flat and circumference profiles just need one parameter (the truncation radius).
-@cindex Seed, Random number generator
-@cindex Random number generator, Seed
-If you do not specify a value for @code{GSL_RNG_TYPE}, GSL will use its
default random number generator type.
-The default type is sufficient for most general applications.
-If no value is given for the @code{GSL_RNG_SEED} environment variable and you
have asked Gnuastro to read the seed from the environment (through the
@option{--envseed} option), then GSL will use the default value of each
generator to give identical outputs.
-If you do not explicitly tell Gnuastro programs to read the seed value from
the environment variable, then they will use the system time (accurate to
within a microsecond) to generate (apparently random) seeds.
-In this manner, every time you run the program, you will get a different
random number distribution.
+The PSF or kernel is a unique (and highly constrained) type of profile: the
sum of its pixels must be one, its center must be the center of the central
pixel (in an image with an odd number of pixels on each side), and commonly it
is circular, so its axis ratio and position angle are one and zero respectively.
+Kernels are commonly necessary for various data analysis and data manipulation steps (for example, see @ref{Convolve} and @ref{NoiseChisel}).
+Because of this, it is inconvenient to define a catalog with one row and many zero-valued columns (for all the unnecessary parameters).
+Hence, with this option, it is possible to create a kernel with MakeProfiles
without the need to create a catalog.
+Here are some examples:
-There are two ways you can specify values for these environment variables.
-You can call them on the same command-line for example:
+@table @option
+@item --kernel=moffat,3,2.8,5
+A Moffat kernel with FWHM of 3 pixels, @mymath{\beta=2.8} which is truncated
at 5 times the FWHM.
-@example
-$ GSL_RNG_TYPE="taus" GSL_RNG_SEED=345 astmknoise input.fits
-@end example
+@item --kernel=gaussian,2,3
+A circular Gaussian kernel with FWHM of 2 pixels and truncated at 3 times
+the FWHM.
+@end table
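+
+For example, the first kernel above can be built and inspected with the two commands below (a sketch; when no output name is given, MakeProfiles writes the kernel to @file{kernel.fits}):
+
+@example
+$ astmkprof --kernel=moffat,3,2.8,5 --oversample=3
+$ astscript-fits-view kernel.fits --ds9scale=minmax
+@end example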
-@noindent
-In this manner the values will only be used for this particular execution of
MakeNoise.
-Alternatively, you can define them for the full period of your terminal
session or script length, using the shell's @command{export} command with the
two separate commands below (for a script remove the @code{$} signs):
+This option may also be used to create a 3D kernel.
+To do that, two small modifications are necessary: add a @code{-3d} (or
@code{-3D}) to the profile name (for example, @code{moffat-3d}) and add a
number (axis-ratio along the third dimension) to the end of the parameters for
all profiles except @code{point}.
+The main reason behind providing an axis ratio in the third dimension is that
in 3D astronomical datasets, commonly the third dimension does not have the
same nature (units/sampling) as the first and second.
-@example
-$ export GSL_RNG_TYPE="taus"
-$ export GSL_RNG_SEED=345
-@end example
+For example, in IFU (optical) or radio data cubes, the first and second dimensions are commonly spatial/angular positions (like RA and Dec), but the third dimension is wavelength or frequency (in units of Angstroms or Hertz).
+Because of this different nature (which also affects the processing), it may
be necessary for the kernel to have a different extent in that direction.
-@cindex Startup scripts
-@cindex @file{.bashrc}
-@noindent
-The subsequent programs which use GSL's random number generators will hence
forth use these values in this session of the terminal you are running or while
executing this script.
-In case you want to set fixed values for these parameters every time you use
the GSL random number generator, you can add these two lines to your
@file{.bashrc} startup script@footnote{Do Not forget that if you are going to
give your scripts (that use the GSL random number generator) to others you have
to make sure you also tell them to set these environment variable separately.
-So for scripts, it is best to keep all such variable definitions within the
script, even if they are within your @file{.bashrc}.}, see @ref{Installation
directory}.
+If the 3rd dimension axis ratio is equal to @mymath{1.0}, then the kernel will
be a spheroid.
+If it is smaller than @mymath{1.0}, the kernel will be button-shaped: extended
less in the third dimension.
+However, when it is larger than @mymath{1.0}, the kernel will be bullet-shaped: extended more in the third dimension.
+In the latter case, the radial parameter will correspond to the length along
the 3rd dimension.
+For example, let's have a look at the two examples above but in 3D:
-@strong{IMPORTANT NOTE:} If the two environment variables @code{GSL_RNG_TYPE}
and @code{GSL_RNG_SEED} are defined, GSL will report them by default, even if
you do not use the @option{--envseed} option.
-For example, see this call to MakeProfiles:
+@table @option
+@item --kernel=moffat-3d,3,2.8,5,0.5
+An ellipsoid Moffat kernel with FWHM of 3 pixels, @mymath{\beta=2.8} which is
truncated at 5 times the FWHM.
+The ellipsoid is circular in the first two dimensions, but in the third
dimension its extent is half the first two.
-@example
-$ export GSL_RNG_TYPE=taus
-$ export GSL_RNG_SEED=345
-$ astmkprof -s1 --kernel=gaussian,2,5
-GSL_RNG_TYPE=taus
-GSL_RNG_SEED=345
-MakeProfiles V.VV started on DDD MMM DDD HH:MM:SS YYYY
- - Building one gaussian kernel
- - Random number generator (RNG) type: taus
- - Basic RNG seed: 1618960836
- ---- ./kernel.fits created.
- -- Output: ./kernel.fits
-MakeProfiles finished in 0.068945 seconds
-@end example
+@item --kernel=gaussian-3d,2,3,1
+A spherical Gaussian kernel with FWHM of 2 pixels and truncated at 3 times
+the FWHM.
+@end table
-@noindent
-@cindex Seed, Random number generator
-@cindex Random number generator, Seed
-The first two output lines (showing the names and values of the GSL
environment variables) are printed by GSL before MakeProfiles actually starts
generating random numbers.
-Gnuastro's programs will report the actual values they use independently
(after the name of the program), you should check them for the final values
used, not GSL's printed values.
-In the example above, did you notice how the random number generator seed
above is different between GSL and MakeProfiles?
-However, if @option{--envseed} was given, both printed seeds would be the same.
+Of course, if a specific kernel is needed that does not fit the constraints
imposed by this option, you can always use a catalog to define any arbitrary
kernel.
+Just call the @option{--individual} and @option{--nomerged} options to make
sure that it is built as a separate file (individually) and no ``merged'' image
of the input profiles is created.
+@item -x INT,INT
+@itemx --mergedsize=INT,INT
+The number of pixels along each axis of the output, in FITS order.
+This is before over-sampling.
+For example, if you call MakeProfiles with @option{--mergedsize=100,150 --oversample=5} (assuming no shift for later convolution), then the final image will be 500 pixels along the first axis and 750 along the second.
+Fractions are acceptable as values for each dimension, however, they must
reduce to an integer, so @option{--mergedsize=150/3,300/3} is acceptable but
@option{--mergedsize=150/4,300/4} is not.
-@node Invoking astmknoise, , Noise basics, MakeNoise
-@subsection Invoking MakeNoise
+When viewing a FITS image in DS9, the first FITS dimension is in the
horizontal direction and the second is vertical.
+As an example, the image created with the example above will have 500 pixels
horizontally and 750 pixels vertically.
-MakeNoise will add noise to an existing image.
-The executable name is @file{astmknoise} with the following general template
+If a background image is specified, this option is ignored.
-@example
-$ astmknoise [OPTION ...] InputImage.fits
-@end example
+@item -s INT
+@itemx --oversample=INT
+The scale to over-sample the profiles and final image.
+If it is not an odd number, it will be incremented by one, see @ref{Oversampling}.
+Note that this @option{--oversample} will remain active even if an input image
is specified.
+If your input catalog is based on the background image, be sure to set
@option{--oversample=1}.
-@noindent
-One line examples:
+@item --psfinimg
+Build the possibly existing PSF profiles (Moffat or Gaussian) in the catalog
into the final image.
+By default they are built separately so you can convolve your images with them; their magnitudes and positions are thus ignored.
+With this option, they will be built in the final image like every other
galaxy profile.
+To have a final PSF in your image, make a point profile where you want the PSF
and after convolution it will be the PSF.
-@example
-## Add noise with a standard deviation of 100 to image.
-## (this is independent of the pixel value: not Poission noise)
-$ astmknoise --sigma=100 image.fits
+@item -i
+@itemx --individual
+@cindex Individual profiles
+@cindex Build individual profiles
+If this option is called, each profile is created in a separate FITS file within the same directory as the output, with the row number of the profile (starting from zero) in its name.
+The file for each row's profile will be in the same directory as the final
combined image of all the profiles and will have the final image's name as a
suffix.
+So for example, if the final combined image is named
@file{./out/fromcatalog.fits}, then the first profile that will be created with
this option will be named @file{./out/0_fromcatalog.fits}.
-## Add noise to the input image assuming a per-pixel background
-## magnitude (with zero point magnitude of 0) and an
-## instrumental noise of 20.
-$ astmknoise --background=-10 -z0 --instrumental=20 mockimage.fits
-@end example
+Since each image only has one full profile (out to the truncation radius), the profile is centered; so only the sub-pixel position of the profile center is important for the outputs of this option.
+The output will have an odd number of pixels.
+If there is no oversampling, the central pixel will contain the profile center.
+If the value to @option{--oversample} is larger than unity, then the profile
center is on any of the central @option{--oversample}'d pixels depending on the
fractional value of the profile center.
-@noindent
-If actual processing is to be done, the input image is a mandatory argument.
-The full list of options common to all the programs in Gnuastro can be seen in
@ref{Common options}.
-The type (see @ref{Numeric data types}) of the output can be specified with
the @option{--type} option, see @ref{Input output options}.
-The header of the output FITS file keeps all the parameters that were
influential in making it.
-This is done for future reproducibility.
+If the fractional value is larger than half, it is on the bottom half of the
central region.
+This is due to the FITS definition of a real number position: the center of a pixel has fractional value @mymath{0.00}, so each pixel contains these fractions: .5 -- .75 -- .00 (pixel center) -- .25 -- .5.
-@table @option
+@item -m
+@itemx --nomerged
+Do not make a merged image.
+By default after making the profiles, they are added to a final image with
side lengths specified by @option{--mergedsize} if they overlap with it.
-@item -b FLT
-@itemx --background=FLT
-The background value (per pixel) that will be added to each pixel value
(internally) to simulate Poisson noise, see @ref{Photon counting noise}.
-By default the units of this value are assumed to be in magnitudes, hence a
@option{--zeropoint} is also necessary.
-If the background is in units of counts, you need add
@option{--bgisbrightness}, see @ref{Brightness flux magnitude}.
+@end table
-Internally, the value given to this option will be converted to counts
(@mymath{b}, when @option{--bgnotmag} is called, the value will be used
directly).
-Assuming the pixel value is @mymath{p}, the random value for that pixel will
be taken from a Gaussian distribution with mean of @mymath{p+b} and standard
deviation of @mymath{\sqrt{p+b}}.
-With this option, the noise will therefore be dependent on the pixel values:
according to the Poission noise model, as the pixel value becomes larger, its
noise will also become larger.
-This is thus a realistic way to model noise, see @ref{Photon counting noise}.
-@item -B
-@itemx --bgnotmag
-The value given to @option{--background} should not be interpreted as a
magnitude, but the raw pixel units (usually counts).
+@noindent
+The options below can be used to define the world coordinate system (WCS)
properties of the MakeProfiles outputs.
+The option names are deliberately chosen to be the same as the FITS standard
WCS keywords.
+See Section 8 of @url{https://doi.org/10.1051/0004-6361/201015362, Pence et al
[2010]} for a short introduction to WCS in the FITS standard@footnote{The world
coordinate standard in FITS is a very beautiful and powerful concept to
link/associate datasets with the outside world (other datasets).
+The description in the FITS standard (link above) only touches the tip of the iceberg.
+To learn more please see @url{https://doi.org/10.1051/0004-6361:20021326,
Greisen and Calabretta [2002]},
@url{https://doi.org/10.1051/0004-6361:20021327, Calabretta and Greisen
[2002]}, @url{https://doi.org/10.1051/0004-6361:20053818, Greisen et al.
[2006]}, and
@url{http://www.atnf.csiro.au/people/mcalabre/WCS/dcs_20040422.pdf, Calabretta
et al.}}.
-@item -z FLT
-@itemx --zeropoint=FLT
-The zero point magnitude used to convert the value of @option{--background}
(in units of magnitude) to flux, see @ref{Brightness flux magnitude}.
+If you look into the headers of a FITS image with WCS for example, you will
see all these names but in uppercase and with numbers to represent the
dimensions, for example, @code{CRPIX1} and @code{PC2_1}.
+You can see the FITS headers with Gnuastro's @ref{Fits} program using a
command like this: @command{$ astfits -p image.fits}.
-@item -i FLT
-@itemx --instrumental=FLT
-The instrumental noise which is in units of flux, see @ref{Instrumental noise}.
+If the values given to any of these options do not correspond to the number of dimensions in the output dataset, then no WCS information will be added.
+Also recall that if you use the @option{--background} option, all of these
options are ignored.
+For example, if the image given to @option{--background} does not have any WCS,
the output of MakeProfiles will also not have any WCS, even if these options
are given@footnote{If you want to add profiles @emph{and} WCS over the
background image (to produce your output), you need more than one command:
+1. You should use @option{--mergedsize} in MakeProfiles to manually set the
output number of pixels equal to your desired background image (so the
background is zero).
+In this mode, you can use these WCS-related options to define the WCS.
+2. Then use Arithmetic to add the pixels of your mock image to the background (see @ref{Arithmetic}).}.
-@item -s FLT
-@item --sigma=FLT
-The total noise sigma in the same units as the pixel values.
-With this option, the @option{--background}, @option{--zeropoint} and
@option{--instrumental} will be ignored.
-With this option, the noise will be independent of the pixel values (which is
not realistic, see @ref{Photon counting noise}).
-Hence it is only useful if you are working on low surface brightness regions
where the change in pixel value (and thus real noise) is insignificant.
+@table @option
-Generally, @strong{usage of this option is discouraged} unless you understand
the risks of not simulating real noise.
-This is because with this option, you will not get Poisson noise (the common
noise model for astronomical imaging), where the noise varies based on pixel
value.
-Use @option{--background} for adding Poission noise.
+@item --crpix=FLT,FLT
+The pixel coordinates of the WCS reference point.
+Fractions are acceptable for the values of this option.
-@item -e
-@itemx --envseed
-@cindex Seed, Random number generator
-@cindex Random number generator, Seed
-Use the @code{GSL_RNG_SEED} environment variable for the seed used in the
random number generator, see @ref{Generating random numbers}.
-With this option, the output image noise is always going to be identical (or
reproducible).
+@item --crval=FLT,FLT
+The WCS coordinates of the reference point.
+Fractions are acceptable for the values of this option.
+The comma-separated values can either be in degrees (a single number), or
sexagesimal (@code{_h_m_} for RA, @code{_d_m_} for Dec, or @code{_:_:_} for
both).
+In any case, the final value that will be written in the @code{CRVAL} keyword
will be a floating point number in degrees (according to the FITS standard).
-@item -d
-@itemx --doubletype
-Save the output in the double precision floating point format that was used
internally.
-This option will be most useful if the input images were of integer types.
+@item --cdelt=FLT,FLT
+The resolution (size of one data-unit or pixel in WCS units) of the
non-oversampled dataset.
+Fractions are acceptable for the values of this option.
-@end table
+@item --pc=FLT,FLT,FLT,FLT
+The PC matrix of the WCS rotation, see the FITS standard (link above) to
better understand the PC matrix.
+@item --cunit=STR,STR
+The units of each WCS axis, for example, @code{deg}.
+Note that these values are part of the FITS standard (link above).
+MakeProfiles will not complain if you use non-standard values, but later usage
of them might cause trouble.
+@item --ctype=STR,STR
+The type of each WCS axis, for example, @code{RA---TAN} and @code{DEC--TAN}.
+Note that these values are part of the FITS standard (link above).
+MakeProfiles will not complain if you use non-standard values, but later usage
of them might cause trouble.
+@end table
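+
+For example, here is a sketch of a full call that defines a WCS for a newly built image (all numerical values are arbitrary placeholders):
+
+@example
+$ astmkprof cat.txt --mode=img --zeropoint=25 \
+            --mergedsize=1000,1000 --oversample=1 \
+            --crpix=500.5,500.5 --crval=53.16,-27.78 \
+            --cdelt=0.0002,0.0002 --cunit=deg,deg \
+            --ctype=RA---TAN,DEC--TAN
+@end example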
+@node MakeProfiles log file, , MakeProfiles output dataset, Invoking astmkprof
+@subsubsection MakeProfiles log file
+Besides the final merged dataset of all the profiles, or the individual
datasets (see @ref{MakeProfiles output dataset}), if the @option{--log} option
is called, MakeProfiles will also create a log file in the current directory (where you run MakeProfiles).
+See @ref{Common options} for a full description of @option{--log} and other
options that are shared between all Gnuastro programs.
+The values for each column are explained in the first few commented lines of
the log file (starting with @command{#} character).
+Here is a more complete description.
+@itemize
+@item
+An ID (row number of profile in input catalog).
+@item
+The total magnitude of the profile in the output dataset.
+When the profile does not completely overlap with the output dataset, this
will be different from your input magnitude.
+@item
+The number of pixels (in the oversampled image) which used Monte Carlo
integration and not the central pixel value, see @ref{Sampling from a function}.
+@item
+The fraction of flux in the Monte Carlo integrated pixels.
+@item
+If an individual image was created, this column will have a value of @code{1},
otherwise it will have a value of @code{0}.
+@end itemize
@@ -29941,1155 +30144,1142 @@ This option will be most useful if the input
images were of integer types.
-@node High-level calculations, Installed scripts, Data modeling, Top
-@chapter High-level calculations
-After the reduction of raw data (for example, with the programs in @ref{Data
manipulation}) you will have reduced images/data ready for processing/analyzing
(for example, with the programs in @ref{Data analysis}).
-But the processed/analyzed data (or catalogs) are still not enough to derive
any scientific result.
-Even higher-level analysis is still needed to convert the observed magnitudes,
sizes or volumes into physical quantities that we associate with each catalog
entry or detected object which is the purpose of the tools in this section.
+@node MakeNoise, , MakeProfiles, Data modeling
+@section MakeNoise
+@cindex Noise
+Real data are always buried in noise; therefore, to finalize a simulation of real data (for example, to test our observational algorithms) it is essential to add noise to the mock profiles created with MakeProfiles, see @ref{MakeProfiles}.
+Below, the general principles and concepts to help understand how noise is quantified are discussed.
+MakeNoise options and arguments are then discussed in @ref{Invoking astmknoise}.
@menu
-* CosmicCalculator:: Calculate cosmological variables
+* Noise basics:: Noise concepts and definitions.
+* Invoking astmknoise:: Options and arguments to MakeNoise.
@end menu
-@node CosmicCalculator, , High-level calculations, High-level calculations
-@section CosmicCalculator
-To derive higher-level information regarding our sources in extra-galactic
astronomy, cosmological calculations are necessary.
-In Gnuastro, CosmicCalculator is in charge of such calculations.
-Before discussing how CosmicCalculator is called and operates (in
@ref{Invoking astcosmiccal}), it is important to provide a rough but mostly
self sufficient review of the basics and the equations used in the analysis.
-In @ref{Distance on a 2D curved space} the basic idea of understanding
distances in a curved and expanding 2D universe (which we can visualize) are
reviewed.
-Having solidified the concepts there, in @ref{Extending distance concepts to
3D}, the formalism is extended to the 3D universe we are trying to study in our
research.
-The focus here is obtaining a physical insight into these equations (mainly
for the use in real observational studies).
-There are many books thoroughly deriving and proving all the equations with
all possible initial conditions and assumptions for any abstract universe,
interested readers can study those books.
+@node Noise basics, Invoking astmknoise, MakeNoise, MakeNoise
+@subsection Noise basics
+
+@cindex Noise
+@cindex Image noise
+Deep astronomical images, like those used in extragalactic studies, seriously
suffer from noise in the data.
+Generally speaking, the sources of noise in an astronomical image are photon counting noise and instrumental noise, which are discussed in @ref{Photon counting noise} and @ref{Instrumental noise}.
+This review finishes with @ref{Generating random numbers}, which is a short introduction on how random numbers are generated.
+We will see that while software random number generators are not perfect, they
allow us to obtain a reproducible series of random numbers through setting the
random number generator function and seed value.
+Therefore in this section, we will also discuss how you can set these two
parameters in Gnuastro's programs (including MakeNoise).
@menu
-* Distance on a 2D curved space:: Distances in 2D for simplicity.
-* Extending distance concepts to 3D:: Going to 3D (our real universe).
-* Invoking astcosmiccal:: How to run CosmicCalculator.
+* Photon counting noise:: Poisson noise
+* Instrumental noise:: Readout, dark current and other sources.
+* Final noised pixel value:: How the final noised value is calculated.
+* Generating random numbers:: How random numbers are generated.
@end menu
-@node Distance on a 2D curved space, Extending distance concepts to 3D,
CosmicCalculator, CosmicCalculator
-@subsection Distance on a 2D curved space
-
-The observations to date (for example, the Planck 2015 results), have not
measured@footnote{The observations are interpreted under the assumption of
uniform curvature.
-For a relativistic alternative to dark energy (and maybe also some part of
dark matter), non-uniform curvature may be even be more critical, but that is
beyond the scope of this brief explanation.} the presence of significant
curvature in the universe.
-However to be generic (and allow its measurement if it does in fact exist), it
is very important to create a framework that allows non-zero uniform curvature.
-However, this section is not intended to be a fully thorough and
mathematically complete derivation of these concepts.
-There are many references available for such reviews that go deep into the
abstract mathematical proofs.
-The emphasis here is on visualization of the concepts for a beginner.
+@node Photon counting noise, Instrumental noise, Noise basics, Noise basics
+@subsubsection Photon counting noise
-As 3D beings, it is difficult for us to mentally create (visualize) a picture
of the curvature of a 3D volume.
-Hence, here we will assume a 2D surface/space and discuss distances on that 2D
surface when it is flat and when it is curved.
-Once the concepts have been created/visualized here, we will extend them, in
@ref{Extending distance concepts to 3D}, to a real 3D spatial @emph{slice} of
the Universe we live in and hope to study.
+@cindex Counting error
+@cindex de Moivre, Abraham
+@cindex Poisson distribution
+@cindex Photon counting noise
+@cindex Poisson, Sim@'eon Denis
+With the very accurate electronics used in today's detectors, photon counting
noise@footnote{In practice, we are actually counting the electrons that are
produced by each photon, not the actual photons.} is the most significant
source of uncertainty in most datasets.
+To understand this noise (error in counting) and its effect on the images of
astronomical targets, let's start by reviewing how a distribution produced by
counting can be modeled as a parametric function.
-To be more understandable (actively discuss from an observer's point of view)
let's assume there's an imaginary 2D creature living on the 2D space (which
@emph{might} be curved in 3D).
-Here, we will be working with this creature in its efforts to analyze
distances in its 2D universe.
-The start of the analysis might seem too mundane, but since it is difficult to
imagine a 3D curved space, it is important to review all the very basic
concepts thoroughly for an easy transition to a universe that is more difficult
to visualize (a curved 3D space embedded in 4D).
+Counting is an inherently discrete operation, which can only produce non-negative integer outputs.
+For example, we cannot count @mymath{3.2} or @mymath{-2} of anything.
+We only count @mymath{0}, @mymath{1}, @mymath{2}, @mymath{3} and so on.
+The distribution of values that results from such counting efforts is formally known as the @url{https://en.wikipedia.org/wiki/Poisson_distribution, Poisson distribution}.
+It is associated with Sim@'eon Denis Poisson, because he discussed it while working on the number of wrongful convictions in court cases in his 1837 book@footnote{[From Wikipedia] Poisson's result was also derived in a previous study by Abraham de Moivre in 1711.
+Therefore some people suggest it should rightly be called the de Moivre distribution.}.
-To start, let's assume a static (not expanding or shrinking), flat 2D surface
similar to @ref{flatplane} and that the 2D creature is observing its universe
from point @mymath{A}.
-One of the most basic ways to parameterize this space is through the Cartesian
coordinates (@mymath{x}, @mymath{y}).
-In @ref{flatplane}, the basic axes of these two coordinates are plotted.
-An infinitesimal change in the direction of each axis is written as
@mymath{dx} and @mymath{dy}.
-For each point, the infinitesimal changes are parallel with the respective
axes and are not shown for clarity.
-Another very useful way of parameterizing this space is through polar
coordinates.
-For each point, we define a radius (@mymath{r}) and angle (@mymath{\phi}) from
a fixed (but arbitrary) reference axis.
-In @ref{flatplane} the infinitesimal changes for each polar coordinate are
plotted for a random point and a dashed circle is shown for all points with the
same radius.
+@cindex Probability density function
+Let's take @mymath{\lambda} to represent the expected mean count of something.
+Furthermore, let's take @mymath{k} to represent the output of a counting attempt (hence @mymath{k} is a non-negative integer).
+The probability density function of getting @mymath{k} counts (in each
attempt, given the expected/mean count of @mymath{\lambda}) can be written as:
-@float Figure,flatplane
-@center@image{gnuastro-figures/flatplane, 10cm, , }
+@cindex Poisson distribution
+@dispmath{f(k)={\lambda^k \over k!} e^{-\lambda},\quad k\in @{0, 1, 2, 3,
\dots @}}
-@caption{Two dimensional Cartesian and polar coordinates on a flat plane.}
-@end float
+@cindex Skewed Poisson distribution
+Because the Poisson distribution is only defined over non-negative integer values (note the factorial operator, which only applies to non-negative integers), it is naturally very skewed when @mymath{\lambda} is near zero.
+One qualitative way to understand this behavior is that when the mean is near zero, there are far fewer integers below the mean than above it.
+Therefore, to accommodate all possible counts, the distribution has to be strongly skewed towards the positive when the mean is small.
+For more on skewness, see @ref{Skewness caused by signal and its measurement}.
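+
+As a simple numerical illustration of this skew (just plugging values into the probability density function above), take @mymath{\lambda=1}: the probabilities of counting @mymath{0}, @mymath{1}, @mymath{2} and @mymath{3} are approximately @mymath{0.37}, @mymath{0.37}, @mymath{0.18} and @mymath{0.06} respectively.
+No count below zero is possible, so the distribution piles up near zero, with a long tail towards larger counts.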
-Assuming an object is placed at a certain position, which can be parameterized
as @mymath{(x,y)}, or @mymath{(r,\phi)}, a general infinitesimal change in its
position will place it in the coordinates @mymath{(x+dx,y+dy)}, or
@mymath{(r+dr,\phi+d\phi)}.
-The distance (on the flat 2D surface) that is covered by this infinitesimal
change in the static universe (@mymath{ds_s}, the subscript signifies the
static nature of this universe) can be written as:
+@cindex Compare Poisson and Gaussian
+As @mymath{\lambda} becomes larger, the distribution becomes more and more symmetric.
+The variance of a Poisson distribution is always equal to its mean; in other words, the standard deviation is the square root of the mean.
+It can also be proved that when the mean is large, say @mymath{\lambda>1000},
the Poisson distribution approaches the
@url{https://en.wikipedia.org/wiki/Normal_distribution, Normal (Gaussian)
distribution} with mean @mymath{\mu=\lambda} and standard deviation
@mymath{\sigma=\sqrt{\lambda}}.
+In other words, a Poisson distribution (with a sufficiently large
@mymath{\lambda}) is simply a Gaussian that has one free parameter
(@mymath{\mu=\lambda} and @mymath{\sigma=\sqrt{\lambda}}), instead of the two
parameters that the Gaussian distribution originally has (independent
@mymath{\mu} and @mymath{\sigma}).
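+
+For example (a direct consequence of the relations above), a pixel with an expected count of @mymath{\lambda=10000} will fluctuate with a standard deviation of @mymath{\sqrt{10000}=100}: the absolute noise grows with the signal, but the relative scatter (@mymath{100/10000}, or 1 percent) shrinks as the square root of the mean grows.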
-@dispmath{ds_s^2=dx^2+dy^2=dr^2+r^2d\phi^2}
+@cindex Sky value
+@cindex Background flux
+@cindex Undetected objects
+In real situations, the photons/flux from our targets are combined with
photons from a certain background (observationally, the @emph{Sky} value).
+The Sky value is defined to be the average flux of a region in the dataset
with no targets.
+Its physical origin can be the brightness of the atmosphere (for ground-based
instruments), possible stray light within the imaging instrument, the average
flux of undetected targets, etc.
+The Sky value is thus an idealized definition, because in real datasets, what lies deep in the noise (far lower than the detection limit) is never known@footnote{In a real image, a relatively large number of very faint objects can be fully buried in the noise and never detected.
+These undetected objects will bias the background measurement towards slightly larger values.
+Our best approximation is thus to simply assume they are uniform, and consider their average effect.
+See Figure 1 (a.1 and a.2) and Section 2.2 in @url{https://arxiv.org/abs/1505.01664, Akhlaghi and Ichikawa [2015]}.}.
+To account for all of these contributions, the Sky value is measured in practice as the average count/value of the undetected regions in the image.
+In a mock image/dataset, we have the luxury of setting the background (Sky)
value.
-The main question is this: how can the 2D creature incorporate the (possible)
curvature in its universe when it's calculating distances? The universe that it
lives in might equally be a curved surface like @ref{sphereandplane}.
-The answer to this question but for a 3D being (us) is the whole purpose to
this discussion.
-Here, we want to give the 2D creature (and later, ourselves) the tools to
measure distances if the space (that hosts the objects) is curved.
+@cindex Simulating noise
+@cindex Noise simulation
+In summary, the value in each element of the dataset (pixel in an image) is
the sum of contributions from various galaxies and stars (after convolution by
the PSF, see @ref{PSF}).
+Let's name the convolved sum of possibly overlapping objects in each pixel as
@mymath{I_{nn}}.
+@mymath{nn} represents `no noise'.
+For now, let's assume the background (@mymath{B}) is constant and sufficiently
high for the Poisson distribution to be approximated by a Gaussian.
+Then the flux of that pixel, after adding noise, is @emph{a random value}
taken from a Gaussian distribution with the following mean (@mymath{\mu}) and
standard deviation (@mymath{\sigma}):
-@ref{sphereandplane} assumes a spherical shell with radius @mymath{R} as the
curved 2D plane for simplicity.
-The 2D plane is tangent to the spherical shell and only touches it at
@mymath{A}.
-This idea will be generalized later.
-The first step in measuring the distance in a curved space is to imagine a
third dimension along the @mymath{z} axis as shown in @ref{sphereandplane}.
-For simplicity, the @mymath{z} axis is assumed to pass through the center of
the spherical shell.
-Our imaginary 2D creature cannot visualize the third dimension or a curved 2D
surface within it, so the remainder of this discussion is purely abstract for
it (similar to us having difficulty in visualizing a 3D curved space in 4D).
-But since we are 3D creatures, we have the advantage of visualizing the
following steps.
-Fortunately the 2D creature is already familiar with our mathematical
constructs, so it can follow our reasoning.
+@dispmath{\mu=B+I_{nn}, \quad \sigma=\sqrt{B+I_{nn}}}
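+
+To make this concrete with some assumed numbers: for a background of @mymath{B=100} counts and a source contribution of @mymath{I_{nn}=25} counts in a pixel, the noised value is drawn from a Gaussian with @mymath{\mu=125} and @mymath{\sigma=\sqrt{125}\approx11.2}.
+The source signal (25 counts) is therefore only about @mymath{2.2\sigma} above the fluctuations, showing how a bright background can drown out a faint source.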
-With the third axis added, a generic infinitesimal change over @emph{the full}
3D space corresponds to the distance:
+@cindex Bias level in detectors
+@cindex Dark level in detectors
+In astronomical instruments, @mymath{B} is enhanced by adding a ``bias'' level
to each pixel before the shutter is even opened (for the exposure to start).
+As the exposure is ongoing and photo-electrons are accumulating from the
astronomical objects, a ``dark'' current (due to thermal radiation of the
instrument) also builds up in the pixels.
+The ``dark'' current accumulates even when the shutter is closed, as long as the CCD electronics are running (hence the name ``dark'').
+This added dark level further enhances the mean value in a real observation
compared to the raw background value (from the atmosphere for example).
-@dispmath{ds_s^2=dx^2+dy^2+dz^2=dr^2+r^2d\phi^2+dz^2.}
+Since this type of noise is inherent in the objects we study, it is usually
measured on the same scale as the astronomical objects, namely the magnitude
system, see @ref{Brightness flux magnitude}.
+It is then internally converted to the flux scale for further processing.
-@float Figure,sphereandplane
-@center@image{gnuastro-figures/sphereandplane, 10cm, , }
+The equations above clearly show the importance of the background value and
its effect on the final signal to noise ratio in each pixel of a science image.
+It is therefore one of the most important factors in understanding the noise (and properly simulating observations where necessary).
+An inappropriately bright background value can hide the signal of the mock profile behind the noise.
+In other words, a brighter background has a larger standard deviation and vice versa.
+As a result, the only necessary parameter to define photon-counting noise over
a mock image of simulated profiles is the background.
+For a complete example, see @ref{Sufi simulates a detection}.
-@caption{2D spherical shell (centered on @mymath{O}) and flat plane (light
gray) tangent to it at point @mymath{A}.}
-@end float
+To better understand the correlation between the mean (or background) value
and the noise standard deviation, let's use an analogy.
+Consider the profile of your galaxy to be analogous to the profile of a ship
that is sailing in the sea.
+The height of the ship would therefore be analogous to the difference between your galaxy's minimum and maximum flux values.
+Furthermore, let's take the depth of the sea to represent the background value: a deeper sea corresponds to a brighter background.
+In this analogy, the ``noise'' would be the height of the waves that surround the ship: in deeper waters, the waves would also be taller (scaling as the square root of the mean depth at the ship's position).
-It is very important to recognize that this change of distance is for
@emph{any} point in the 3D space, not just those changes that occur on the 2D
spherical shell of @ref{sphereandplane}.
-Recall that our 2D friend can only do measurements on the 2D surfaces, not the
full 3D space.
-So we have to constrain this general change to any change on the 2D spherical
shell.
-To do that, let's look at the arbitrary point @mymath{P} on the 2D spherical
shell.
-Its image (@mymath{P'}) on the flat plain is also displayed. From the dark
gray triangle, we see that
+If the ship is in deep waters, the waves are taller than when the ship is near the beach (at lower depths).
+Therefore, when the ship is in the middle of the sea, there are high waves
that are capable of hiding a significant part of the ship from our perspective.
+This corresponds to a brighter background value in astronomical images: the
resulting noise from that brighter background can completely wash out the
signal from a fainter galaxy, star or solar system object.
-@dispmath{\sin\theta={r\over R},\quad\cos\theta={R-z\over R}.}These relations
allow the 2D creature to find the value of @mymath{z} (an abstract dimension
for it) as a function of r (distance on a flat 2D plane, which it can
visualize) and thus eliminate @mymath{z}.
-From @mymath{\sin^2\theta+\cos^2\theta=1}, we get @mymath{z^2-2Rz+r^2=0} and
solving for @mymath{z}, we find:
+@node Instrumental noise, Final noised pixel value, Photon counting noise,
Noise basics
+@subsubsection Instrumental noise
-@dispmath{z=R\left(1\pm\sqrt{1-{r^2\over R^2}}\right).}
+@cindex Readout noise
+@cindex Instrumental noise
+@cindex Noise, instrumental
+While taking images with a camera, a bias current is fed to the pixels; the variation of this bias current over the pixels also adds to the final image noise.
+Another source of noise is the readout noise produced by the electronics in the detector, specifically the parts that digitize the voltage produced by the photo-electrons in the analog-to-digital converter.
+With the current generation of instruments, this source of noise is not as significant as the noise due to the background Sky discussed in @ref{Photon counting noise}.
-The @mymath{\pm} can be understood from @ref{sphereandplane}: For each
@mymath{r}, there are two points on the sphere, one in the upper hemisphere and
one in the lower hemisphere.
-An infinitesimal change in @mymath{r}, will create the following infinitesimal
change in @mymath{z}:
+Let @mymath{C} represent the combined standard deviation of all these
instrumental sources of noise.
+When only this source of noise is present, the noised pixel value would be a
random value chosen from a Gaussian distribution with
-@dispmath{dz={\mp r\over R}\left(1\over
-\sqrt{1-{r^2/R^2}}\right)dr.}Using the positive signed equation instead of
@mymath{dz} in the @mymath{ds_s^2} equation above, we get:
+@dispmath{\mu=I_{nn}, \quad \sigma=\sqrt{C^2+I_{nn}}}
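+
+For example, in a pixel with no astronomical signal (@mymath{I_{nn}=0}), the relation above reduces to @mymath{\sigma=C}: the instrumental noise thus sets a noise floor that is present even in completely empty regions of the image.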
-@dispmath{ds_s^2={dr^2\over 1-r^2/R^2}+r^2d\phi^2.}
+@cindex ADU
+@cindex Gain
+@cindex Counts
+This type of noise is independent of the signal in the dataset; it is only determined by the instrument.
+So the flux scale (and not magnitude scale) is most commonly used for this
type of noise.
+In practice, this value is usually reported in analog-to-digital units or
ADUs, not flux or electron counts.
+The gain value of the device can be used to convert between these two, see
@ref{Brightness flux magnitude}.
-The derivation above was done for a spherical shell of radius @mymath{R} as a
curved 2D surface.
-To generalize it to any surface, we can define @mymath{K=1/R^2} as the
curvature parameter.
-Then the general infinitesimal change in a static universe can be written as:
+@node Final noised pixel value, Generating random numbers, Instrumental noise,
Noise basics
+@subsubsection Final noised pixel value
+Based on the discussions in @ref{Photon counting noise} and @ref{Instrumental noise}, and depending on the values you specify for @mymath{B} and @mymath{C}, the final noised value for each pixel is a random value chosen from a Gaussian distribution with
-@dispmath{ds_s^2={dr^2\over 1-Kr^2}+r^2d\phi^2.}
+@dispmath{\mu=B+I_{nn}, \quad \sigma=\sqrt{C^2+B+I_{nn}}}
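+
+Continuing the assumed numbers of @ref{Photon counting noise} (@mymath{B=100} and @mymath{I_{nn}=25}) with an instrumental noise of @mymath{C=10}, the final pixel value would be drawn from a Gaussian with @mymath{\mu=125} and @mymath{\sigma=\sqrt{100+100+25}=15}.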
-Therefore, when @mymath{K>0} (and curvature is the same everywhere), we have a
finite universe, where @mymath{r} cannot become larger than @mymath{R} as in
@ref{sphereandplane}.
-When @mymath{K=0}, we have a flat plane (@ref{flatplane}) and a negative
@mymath{K} will correspond to an imaginary @mymath{R}.
-The latter two cases may be infinite in area (which is not a simple concept,
but mathematically can be modeled with @mymath{r} extending infinitely), or
finite-area (like a cylinder is flat everywhere with @mymath{ds_s^2={dx^2 +
dy^2}}, but finite in one direction in size).
-@cindex Proper distance
-A very important issue that can be discussed now (while we are still in 2D and
can actually visualize things) is that @mymath{\overrightarrow{r}} is tangent
to the curved space at the observer's position.
-In other words, it is on the gray flat surface of @ref{sphereandplane}, even
when the universe if curved: @mymath{\overrightarrow{r}=P'-A}.
-Therefore for the point @mymath{P} on a curved space, the raw coordinate
@mymath{r} is the distance to @mymath{P'}, not @mymath{P}.
-The distance to the point @mymath{P} (at a specific coordinate @mymath{r} on
the flat plane) over the curved surface (thick line in @ref{sphereandplane}) is
called the @emph{proper distance} and is displayed with @mymath{l}.
-For the specific example of @ref{sphereandplane}, the proper distance can be
calculated with: @mymath{l=R\theta} (@mymath{\theta} is in radians).
-Using the @mymath{\sin\theta} relation found above, we can find @mymath{l} as
a function of @mymath{r}:
-@dispmath{\theta=\sin^{-1}\left({r\over R}\right)\quad\rightarrow\quad
-l(r)=R\sin^{-1}\left({r\over R}\right)}
+@node Generating random numbers, , Final noised pixel value, Noise basics
+@subsubsection Generating random numbers
+@cindex Random numbers
+@cindex Numbers, random
+As discussed above, to generate noise we need to draw random samples from a particular distribution.
+So it is important to understand some general concepts regarding the
generation of random numbers.
+For a very thorough and enjoyable introduction, we strongly advise reading Donald Knuth's ``The art of computer programming'', volume 2, chapter 3@footnote{Knuth, Donald. 1998.
+The art of computer programming. Addison--Wesley. ISBN 0-201-89684-2}.
+Quoting from the GNU Scientific Library manual, ``If you do not own it, you
should stop reading right now, run to the nearest bookstore, and buy
it''@footnote{For students, running to the library might be more affordable!}!
-@mymath{R} is just an arbitrary constant and can be directly found from
@mymath{K}, so for cleaner equations, it is common practice to set
@mymath{R=1}, which gives: @mymath{l(r)=\sin^{-1}r}.
-Also note that when @mymath{R=1}, then @mymath{l=\theta}.
-Generally, depending on the curvature, in a @emph{static} universe the proper
distance can be written as a function of the coordinate @mymath{r} as (from now
on we are assuming @mymath{R=1}):
+@cindex Pseudo-random numbers
+@cindex Numbers, pseudo-random
+Using only software, we can only produce what is called a pseudo-random sequence of numbers.
+A true random number generator is a hardware device (let's assume we have made sure it has no systematic biases), for example, throwing dice or flipping coins (methods that have been used since ancient times).
+More modern hardware methods use atmospheric noise, thermal noise or other types of external electromagnetic or quantum phenomena.
+All pseudo-random number generators (software) require a seed as the basis of the generation.
+The advantage of having a seed is that if you specify the same seed for multiple runs, you will get an identical sequence of random numbers, which allows you to reproduce the same final noised image.
-@dispmath{l(r)=\sin^{-1}(r)\quad(K>0),\quad\quad
-l(r)=r\quad(K=0),\quad\quad l(r)=\sinh^{-1}(r)\quad(K<0).}With
-@mymath{l}, the infinitesimal change of distance can be written in a
-more simpler and abstract form of
+@cindex Environment variables
+@cindex GNU Scientific Library
+The programs in GNU Astronomy Utilities (for example, MakeNoise or
MakeProfiles) use the GNU Scientific Library (GSL) to generate random numbers.
+GSL allows the user to set the random number generator through environment
variables, see @ref{Installation directory} for an introduction to environment
variables.
+In its chapter titled ``Random Number Generation'', the GSL manual fully explains the various random number generators that are available (there are a lot of them!).
+Through the two environment variables @code{GSL_RNG_TYPE} and
@code{GSL_RNG_SEED} you can specify the generator and its seed respectively.
-@dispmath{ds_s^2=dl^2+r^2d\phi^2.}
+@cindex Seed, Random number generator
+@cindex Random number generator, Seed
+If you do not specify a value for @code{GSL_RNG_TYPE}, GSL will use its
default random number generator type.
+The default type is sufficient for most general applications.
+If no value is given for the @code{GSL_RNG_SEED} environment variable and you
have asked Gnuastro to read the seed from the environment (through the
@option{--envseed} option), then GSL will use the default value of each
generator to give identical outputs.
+If you do not explicitly tell Gnuastro programs to read the seed value from
the environment variable, then they will use the system time (accurate to
within a microsecond) to generate (apparently random) seeds.
+In this manner, every time you run the program, you will get a different sequence of random numbers.
-@cindex Comoving distance
-Until now, we had assumed a static universe (not changing with time).
-But our observations so far appear to indicate that the universe is expanding
(it is not static).
-Since there is no reason to expect the observed expansion is unique to our
particular position of the universe, we expect the universe to be expanding at
all points with the same rate at the same time.
-Therefore, to add a time dependence to our distance measurements, we can
include a multiplicative scaling factor, which is a function of time:
@mymath{a(t)}.
-The functional form of @mymath{a(t)} comes from the cosmology, the physics we
assume for it: general relativity, and the choice of whether the universe is
uniform (`homogeneous') in density and curvature or inhomogeneous.
-In this section, the functional form of @mymath{a(t)} is irrelevant, so we can
avoid these issues.
+There are two ways to specify values for these environment variables.
+You can set them on the same command-line as the program, for example:
-With this scaling factor, the proper distance will also depend on time.
-As the universe expands, the distance between two given points will shift to
larger values.
-We thus define a distance measure, or coordinate, that is independent of time
and thus does not `move'.
-We call it the @emph{comoving distance} and display with @mymath{\chi} such
that: @mymath{l(r,t)=\chi(r)a(t)}.
-We have therefore, shifted the @mymath{r} dependence of the proper distance we
derived above for a static universe to the comoving distance:
+@example
+$ GSL_RNG_TYPE="taus" GSL_RNG_SEED=345 astmknoise input.fits
+@end example
-@dispmath{\chi(r)=\sin^{-1}(r)\quad(K>0),\quad\quad
-\chi(r)=r\quad(K=0),\quad\quad \chi(r)=\sinh^{-1}(r)\quad(K<0).}
+@noindent
+In this manner the values will only be used for this particular execution of MakeNoise.
+Alternatively, you can define them for the rest of your terminal session (or the rest of a script), using the shell's @command{export} command with the two separate commands below (in a script, remove the @code{$} signs):
-Therefore, @mymath{\chi(r)} is the proper distance to an object at a specific
reference time: @mymath{t=t_r} (the @mymath{r} subscript signifies
``reference'') when @mymath{a(t_r)=1}.
-At any arbitrary moment (@mymath{t\neq{t_r}}) before or after @mymath{t_r},
the proper distance to the object can be scaled with @mymath{a(t)}.
+@example
+$ export GSL_RNG_TYPE="taus"
+$ export GSL_RNG_SEED=345
+@end example
-Measuring the change of distance in a time-dependent (expanding) universe only
makes sense if we can add up space and time@footnote{In other words, making our
space-time consistent with Minkowski space-time geometry.
-In this geometry, different observers at a given point (event) in space-time
split up space-time into `space' and `time' in different ways, just like people
at the same spatial position can make different choices of splitting up a map
into `left--right' and `up--down'.
-This model is well supported by twentieth and twenty-first century
observations.}.
-But we can only add bits of space and time together if we measure them in the
same units: with a conversion constant (similar to how 1000 is used to convert
a kilometer into meters).
-Experimentally, we find strong support for the hypothesis that this conversion
constant is the speed of light (or gravitational waves@footnote{The speed of
gravitational waves was recently found to be very similar to that of light in
vacuum, see @url{https://arxiv.org/abs/1710.05834, arXiv:1710.05834}.}) in a
vacuum.
-This speed is postulated to be constant@footnote{In @emph{natural units},
speed is measured in units of the speed of light in vacuum.} and is almost
always written as @mymath{c}.
-We can thus parameterize the change in distance on an expanding 2D surface as
+@cindex Startup scripts
+@cindex @file{.bashrc}
+@noindent
+Subsequent programs that use GSL's random number generators will henceforth use these values in this terminal session (or while executing this script).
+In case you want to set fixed values for these parameters every time you use the GSL random number generator, you can add these two lines to your @file{.bashrc} startup script@footnote{Do not forget that if you give your scripts (that use the GSL random number generator) to others, you have to tell them to set these environment variables separately.
+So for scripts, it is best to keep all such variable definitions within the script, even if they are also in your @file{.bashrc}.}, see @ref{Installation directory}.
-@dispmath{ds^2=c^2dt^2-a^2(t)ds_s^2 = c^2dt^2-a^2(t)(d\chi^2+r^2d\phi^2).}
+@strong{IMPORTANT NOTE:} If the two environment variables @code{GSL_RNG_TYPE}
and @code{GSL_RNG_SEED} are defined, GSL will report them by default, even if
you do not use the @option{--envseed} option.
+For example, see this call to MakeProfiles:
+@example
+$ export GSL_RNG_TYPE=taus
+$ export GSL_RNG_SEED=345
+$ astmkprof -s1 --kernel=gaussian,2,5
+GSL_RNG_TYPE=taus
+GSL_RNG_SEED=345
+MakeProfiles V.VV started on DDD MMM DDD HH:MM:SS YYYY
+ - Building one gaussian kernel
+ - Random number generator (RNG) type: taus
+ - Basic RNG seed: 1618960836
+ ---- ./kernel.fits created.
+ -- Output: ./kernel.fits
+MakeProfiles finished in 0.068945 seconds
+@end example
-@node Extending distance concepts to 3D, Invoking astcosmiccal, Distance on a
2D curved space, CosmicCalculator
-@subsection Extending distance concepts to 3D
+@noindent
+@cindex Seed, Random number generator
+@cindex Random number generator, Seed
+The first two output lines (showing the names and values of the GSL
environment variables) are printed by GSL before MakeProfiles actually starts
generating random numbers.
+Gnuastro's programs report the actual values they use independently (after the name of the program); you should check those for the final values used, not GSL's printed values.
+In the example above, did you notice how the random number generator seed is different between GSL and MakeProfiles?
+However, if @option{--envseed} had been given, both printed seeds would be the same.
-The concepts of @ref{Distance on a 2D curved space} are here extended to a 3D
space that @emph{might} be curved.
-We can start with the generic infinitesimal distance in a static 3D universe,
but this time in spherical coordinates instead of polar coordinates.
-@mymath{\theta} is shown in @ref{sphereandplane}, but here we are 3D beings,
positioned on @mymath{O} (the center of the sphere) and the point @mymath{O} is
tangent to a 4D-sphere.
-In our 3D space, a generic infinitesimal displacement will correspond to the
following distance in spherical coordinates:
-@dispmath{ds_s^2=dx^2+dy^2+dz^2=dr^2+r^2(d\theta^2+\sin^2{\theta}d\phi^2).}
+@node Invoking astmknoise, , Noise basics, MakeNoise
+@subsection Invoking MakeNoise
-Like the 2D creature before, we now have to assume an abstract dimension which
we cannot visualize easily.
-Let's call the fourth dimension @mymath{w}, then the general change in
coordinates in the @emph{full} four dimensional space will be:
+MakeNoise will add noise to an existing image.
+The executable name is @file{astmknoise} with the following general template
-@dispmath{ds_s^2=dr^2+r^2(d\theta^2+\sin^2{\theta}d\phi^2)+dw^2.}
+@example
+$ astmknoise [OPTION ...] InputImage.fits
+@end example
@noindent
-But we can only work on a 3D curved space, so following exactly the same steps
and conventions as our 2D friend, we arrive at:
+One line examples:
-@dispmath{ds_s^2={dr^2\over 1-Kr^2}+r^2(d\theta^2+\sin^2{\theta}d\phi^2).}
+@example
+## Add noise with a standard deviation of 100 to image.
+## (this is independent of the pixel value: not Poisson noise)
+$ astmknoise --sigma=100 image.fits
+
+## Add noise to the input image assuming a per-pixel background
+## magnitude of -10 (with a zero point magnitude of 0) and an
+## instrumental noise of 20.
+$ astmknoise --background=-10 -z0 --instrumental=20 mockimage.fits
+@end example
@noindent
-In a non-static universe (with a scale factor a(t)), the distance can be
written as:
+If actual processing is to be done, the input image is a mandatory argument.
+The full list of options common to all the programs in Gnuastro can be seen in
@ref{Common options}.
+The type (see @ref{Numeric data types}) of the output can be specified with
the @option{--type} option, see @ref{Input output options}.
+The header of the output FITS file keeps all the parameters that were
influential in making it.
+This is done for future reproducibility.
-@dispmath{ds^2=c^2dt^2-a^2(t)[d\chi^2+r^2(d\theta^2+\sin^2{\theta}d\phi^2)].}
+@table @option
+@item -b FLT
+@itemx --background=FLT
+The background value (per pixel) that will be added to each pixel value
(internally) to simulate Poisson noise, see @ref{Photon counting noise}.
+By default the units of this value are assumed to be magnitudes, hence a @option{--zeropoint} is also necessary.
+If the background is in units of counts, you need to add @option{--bgnotmag}, see @ref{Brightness flux magnitude}.
+Internally, the value given to this option will be converted to counts, @mymath{b} (when @option{--bgnotmag} is called, the value will be used directly).
+Assuming the pixel value is @mymath{p}, the random value for that pixel will be taken from a Gaussian distribution with mean of @mymath{p+b} and standard deviation of @mymath{\sqrt{p+b}}.
+With this option, the noise will therefore depend on the pixel values: according to the Poisson noise model, as the pixel value becomes larger, its noise will also become larger.
+This is thus a realistic way to model noise, see @ref{Photon counting noise}; an illustrative example is given just after this table.
-@c@dispmath{H(z){\equiv}\left(\dot{a}\over a\right)(z)=H_0E(z) }
+@item -B
+@itemx --bgnotmag
+The value given to @option{--background} should not be interpreted as a magnitude, but in raw pixel units (usually counts).
-@c@dispmath{E(z)=[ \Omega_{\Lambda,0} + \Omega_{C,0}(1+z)^2 +
-@c\Omega_{m,0}(1+z)^3 + \Omega_{r,0}(1+z)^4 ]^{1/2}}
+@item -z FLT
+@itemx --zeropoint=FLT
+The zero point magnitude used to convert the value of @option{--background}
(in units of magnitude) to flux, see @ref{Brightness flux magnitude}.
-@c Let's take @mymath{r} to be the radial coordinate of the emitting
-@c source, which emitted its light at redshift $z$. Then the comoving
-@c distance of this object would be:
+@item -i FLT
+@itemx --instrumental=FLT
+The instrumental noise, which is in units of flux; see @ref{Instrumental noise}.
-@c@dispmath{ \chi(r)={c\over H_0a_0}\int_0^z{dz'\over E(z')} }
+@item -s FLT
+@itemx --sigma=FLT
+The total noise sigma in the same units as the pixel values.
+With this option, @option{--background}, @option{--zeropoint} and @option{--instrumental} will be ignored.
+The noise will then be independent of the pixel values (which is not realistic, see @ref{Photon counting noise}).
+Hence it is only useful if you are working on low surface brightness regions where the change in pixel value (and thus real noise) is insignificant.
-@c@noindent
-@c So the proper distance at the current time to that object is:
-@c @mymath{a_0\chi(r)}, therefore the angular diameter distance
-@c (@mymath{d_A}) and luminosity distance (@mymath{d_L}) can be written
-@c as:
+Generally, @strong{usage of this option is discouraged} unless you understand
the risks of not simulating real noise.
+This is because with this option, you will not get Poisson noise (the common
noise model for astronomical imaging), where the noise varies based on pixel
value.
+Use @option{--background} for adding Poisson noise.
-@c@dispmath{ d_A={a_0\chi(r)\over 1+z}, \quad d_L=a_0\chi(r)(1+z) }
+@item -e
+@itemx --envseed
+@cindex Seed, Random number generator
+@cindex Random number generator, Seed
+Use the @code{GSL_RNG_SEED} environment variable for the seed used in the
random number generator, see @ref{Generating random numbers}.
+With this option, the noise in the output image will be reproducible (identical over multiple runs with the same seed).
+@item -d
+@itemx --doubletype
+Save the output in the double precision floating point format that was used
internally.
+This option will be most useful if the input images were of integer types.
+@end table
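+
+As an illustrative sketch of these options together (assuming a hypothetical @file{mock.fits} input), the first command below adds Poisson noise with the background given directly in counts; the second makes the result reproducible by fixing the seed and reading it from the environment (repeated runs will then produce identical noise):
+
+@example
+$ astmknoise --background=100 --bgnotmag mock.fits
+$ export GSL_RNG_SEED=345
+$ astmknoise --envseed --background=100 --bgnotmag mock.fits
+@end example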
-@node Invoking astcosmiccal, , Extending distance concepts to 3D,
CosmicCalculator
-@subsection Invoking CosmicCalculator
-CosmicCalculator will calculate cosmological variables based on the input
parameters.
-The executable name is @file{astcosmiccal} with the following general template
-@example
-$ astcosmiccal [OPTION...] ...
-@end example
-@noindent
-One line examples:
-@example
-## Print basic cosmological properties at redshift 2.5:
-$ astcosmiccal -z2.5
-## Only print Comoving volume over 4pi stradian to z (Mpc^3):
-$ astcosmiccal --redshift=0.8 --volume
-## Print redshift and age of universe when Lyman-alpha line is
-## at 6000 angstrom (another way to specify redshift).
-$ astcosmiccal --obsline=Ly-alpha,6000 --age
-## Print luminosity distance, angular diameter distance and age
-## of universe in one row at redshift 0.4
-$ astcosmiccal -z0.4 -LAg
-## Assume Lambda and matter density of 0.7 and 0.3 and print
-## basic cosmological parameters for redshift 2.1:
-$ astcosmiccal -l0.7 -m0.3 -z2.1
-## Print wavelength of all pre-defined spectral lines when
-## Lyman-alpha is observed at 4000 Angstroms.
-$ astcosmiccal --obsline=Ly-alpha,4000 --listlinesatz
-@end example
-The input parameters (current matter density, etc.) can be given as
command-line options or in the configuration files, see @ref{Configuration
files}.
-For a definition of the different parameters, please see the sections prior to
this.
-If no redshift is given, CosmicCalculator will just print its input parameters
and abort.
-For a full list of the input options, please see @ref{CosmicCalculator input
options}.
-Without any particular output requested (and only a given redshift),
CosmicCalculator will print all basic cosmological calculations (one per line)
with some explanations before each.
-This can be good when you want a general feeling of the conditions at a
specific redshift.
-Alternatively, if any specific calculation(s) are requested (its possible to
call more than one), only the requested value(s) will be calculated and printed
with one character space between them.
-In this case, no description or units will be printed.
-See @ref{CosmicCalculator basic cosmology calculations} for the full list of
these options along with some explanations how when/how they can be useful.
-Another common operation in observational cosmology is dealing with spectral
lines at different redshifts.
-CosmicCalculator also has features to help in such situations, please see
@ref{CosmicCalculator spectral line calculations}.
-@menu
-* CosmicCalculator input options:: Options to specify input conditions.
-* CosmicCalculator basic cosmology calculations:: Such as distance modulus
and distances.
-* CosmicCalculator spectral line calculations:: How they get affected by
redshift.
-@end menu
-@node CosmicCalculator input options, CosmicCalculator basic cosmology
calculations, Invoking astcosmiccal, Invoking astcosmiccal
-@subsubsection CosmicCalculator input options
-The inputs to CosmicCalculator can be specified with the following options:
-@table @option
+@node High-level calculations, Installed scripts, Data modeling, Top
+@chapter High-level calculations
-@item -z FLT
-@itemx --redshift=FLT
-The redshift of interest.
-There are two other ways that you can specify the target redshift:
-1) Spectral lines and their observed wavelengths, see @option{--obsline}.
-2) Velocity, see @option{--velocity}.
-Hence this option cannot be called with @option{--obsline} or
@option{--velocity}.
+After the reduction of raw data (for example, with the programs in @ref{Data
manipulation}) you will have reduced images/data ready for processing/analyzing
(for example, with the programs in @ref{Data analysis}).
+But the processed/analyzed data (or catalogs) are still not enough to derive
any scientific result.
+Even higher-level analysis is still needed to convert the observed magnitudes, sizes or volumes into the physical quantities that we associate with each catalog entry or detected object; that conversion is the purpose of the tools in this chapter.
-@item -y FLT
-@itemx --velocity=FLT
-Input velocity in km/s.
-The given value will be converted to redshift internally, and used in any
subsequent calculation.
-This option is thus an alternative to @code{--redshift} or @code{--obsline},
it cannot be used with them.
-The conversion will be done with the more general and accurate relativistic
equation of @mymath{1+z=\sqrt{(c+v)/(c-v)}}, not the simplified
@mymath{z\approx v/c}.
-@item -H FLT
-@itemx --H0=FLT
-Current expansion rate (in km sec@mymath{^{-1}} Mpc@mymath{^{-1}}).
-@item -l FLT
-@itemx --olambda=FLT
-Cosmological constant density divided by the critical density in the current
Universe (@mymath{\Omega_{\Lambda,0}}).
-@item -m FLT
-@itemx --omatter=FLT
-Matter (including massive neutrinos) density divided by the critical density
in the current Universe (@mymath{\Omega_{m,0}}).
-@item -r FLT
-@itemx --oradiation=FLT
-Radiation density divided by the critical density in the current Universe
(@mymath{\Omega_{r,0}}).
+@menu
+* CosmicCalculator:: Calculate cosmological variables
+@end menu
-@item -O STR/FLT,FLT
-@itemx --obsline=STR/FLT,FLT
-@cindex Rest-frame wavelength
-@cindex Wavelength, rest-frame
-Find the redshift to use in next steps based on the rest-frame and observed
wavelengths of a line.
-This option is thus an alternative to @code{--redshift} or @code{--velocity},
it cannot be used with them.
+@node CosmicCalculator, , High-level calculations, High-level calculations
+@section CosmicCalculator
-The first argument identifies the line.
-It can be one of the standard names, or any rest-frame wavelength in Angstroms.
-The second argument is the observed wavelength of that line.
-For example, @option{--obsline=Ly-alpha,6000} is the same as
@option{--obsline=1215.64,6000}.
-Wavelengths are assumed to be in Angstroms by default (other units can be
selected with @option{--lineunit}, see @ref{CosmicCalculator spectral line
calculations}).
+To derive higher-level information regarding our sources in extra-galactic
astronomy, cosmological calculations are necessary.
+In Gnuastro, CosmicCalculator is in charge of such calculations.
+Before discussing how CosmicCalculator is called and operates (in @ref{Invoking astcosmiccal}), it is important to provide a rough but mostly self-sufficient review of the basics and the equations used in the analysis.
+In @ref{Distance on a 2D curved space}, the basic idea of understanding distances in a curved and expanding 2D universe (which we can visualize) is reviewed.
+Having solidified the concepts there, in @ref{Extending distance concepts to
3D}, the formalism is extended to the 3D universe we are trying to study in our
research.
-The list of pre-defined names for the lines in Gnuastro's database is
available by running
+The focus here is on obtaining physical insight into these equations (mainly for use in real observational studies).
+There are many books that thoroughly derive and prove all the equations with all possible initial conditions and assumptions for any abstract universe; interested readers can study those books.
-@example
-$ astcosmiccal --listlines
-@end example
-@end table
+@menu
+* Distance on a 2D curved space:: Distances in 2D for simplicity.
+* Extending distance concepts to 3D:: Going to 3D (our real universe).
+* Invoking astcosmiccal:: How to run CosmicCalculator.
+@end menu
+@node Distance on a 2D curved space, Extending distance concepts to 3D,
CosmicCalculator, CosmicCalculator
+@subsection Distance on a 2D curved space
+The observations to date (for example, the Planck 2015 results) have not measured@footnote{The observations are interpreted under the assumption of uniform curvature.
+For a relativistic alternative to dark energy (and maybe also some part of dark matter), non-uniform curvature may be even more critical, but that is beyond the scope of this brief explanation.} the presence of significant curvature in the universe.
+However, to be generic (and allow its measurement if it does in fact exist), it is very important to create a framework that allows non-zero uniform curvature.
+That said, this section is not intended to be a fully thorough and mathematically complete derivation of these concepts.
+There are many references available for such reviews that go deep into the
abstract mathematical proofs.
+The emphasis here is on visualization of the concepts for a beginner.
-@node CosmicCalculator basic cosmology calculations, CosmicCalculator spectral
line calculations, CosmicCalculator input options, Invoking astcosmiccal
-@subsubsection CosmicCalculator basic cosmology calculations
-By default, when no specific calculations are requested, CosmicCalculator will
print a complete set of all its calculators (one line for each calculation, see
@ref{Invoking astcosmiccal}).
-The full list of calculations can be useful when you do not want any specific
value, but just a general view.
-In other contexts (for example, in a batch script or during a discussion), you
know exactly what you want and do not want to be distracted by all the extra
information.
+As 3D beings, it is difficult for us to mentally create (visualize) a picture
of the curvature of a 3D volume.
+Hence, here we will assume a 2D surface/space and discuss distances on that 2D
surface when it is flat and when it is curved.
+Once the concepts have been created/visualized here, we will extend them, in
@ref{Extending distance concepts to 3D}, to a real 3D spatial @emph{slice} of
the Universe we live in and hope to study.
-You can use any number of the options described below in any order.
-When any of these options are requested, CosmicCalculator's output will just
be a single line with a single space between the (possibly) multiple values.
-In the example below, only the tangential distance along one arc-second (in
kpc), absolute magnitude conversion, and age of the universe at redshift 2 are
printed (recall that you can merge short options together, see @ref{Options}).
+To make the discussion more tangible (actively discussing it from an observer's point of view), let's assume there is an imaginary 2D creature living on the 2D space (which @emph{might} be curved in 3D).
+Here, we will be working with this creature in its efforts to analyze
distances in its 2D universe.
+The start of the analysis might seem too mundane, but since it is difficult to
imagine a 3D curved space, it is important to review all the very basic
concepts thoroughly for an easy transition to a universe that is more difficult
to visualize (a curved 3D space embedded in 4D).
-@example
-$ astcosmiccal -z2 -sag
-8.585046 44.819248 3.289979
-@end example
+To start, let's assume a static (not expanding or shrinking), flat 2D surface
similar to @ref{flatplane} and that the 2D creature is observing its universe
from point @mymath{A}.
+One of the most basic ways to parameterize this space is through the Cartesian
coordinates (@mymath{x}, @mymath{y}).
+In @ref{flatplane}, the basic axes of these two coordinates are plotted.
+An infinitesimal change in the direction of each axis is written as
@mymath{dx} and @mymath{dy}.
+For each point, the infinitesimal changes are parallel with the respective
axes and are not shown for clarity.
+Another very useful way of parameterizing this space is through polar
coordinates.
+For each point, we define a radius (@mymath{r}) and angle (@mymath{\phi}) from
a fixed (but arbitrary) reference axis.
+In @ref{flatplane} the infinitesimal changes for each polar coordinate are
plotted for a random point and a dashed circle is shown for all points with the
same radius.
-Here is one example of using this feature in scripts: by adding the following
two lines in a script to keep/use the comoving volume with varying redshifts:
+@float Figure,flatplane
+@center@image{gnuastro-figures/flatplane, 10cm, , }
-@example
-z=3.12
-vol=$(astcosmiccal --redshift=$z --volume)
-@end example
+@caption{Two dimensional Cartesian and polar coordinates on a flat plane.}
+@end float
-@cindex GNU Grep
-@noindent
-In a script, this operation might be necessary for a large number of objects
(several of galaxies in a catalog for example).
-So the fact that all the other default calculations are ignored will also help
you get to your result faster.
+Assuming an object is placed at a certain position, which can be parameterized
as @mymath{(x,y)}, or @mymath{(r,\phi)}, a general infinitesimal change in its
position will place it in the coordinates @mymath{(x+dx,y+dy)}, or
@mymath{(r+dr,\phi+d\phi)}.
+The distance (on the flat 2D surface) that is covered by this infinitesimal
change in the static universe (@mymath{ds_s}, the subscript signifies the
static nature of this universe) can be written as:
-If you are indeed dealing with many (for example, thousands) of redshifts,
using CosmicCalculator is not the best/fastest solution.
-Because it has to go through all the configuration files and preparations for
each invocation.
-To get the best efficiency (least overhead), we recommend using Gnuastro's
cosmology library (see @ref{Cosmology library}).
-CosmicCalculator also calls the library functions defined there for its
calculations, so you get the same result with no overhead.
-Gnuastro also has libraries for easily reading tables into a C program, see
@ref{Table input output}.
-Afterwards, you can easily build and run your C program for the particular
processing with @ref{BuildProgram}.
+@dispmath{ds_s^2=dx^2+dy^2=dr^2+r^2d\phi^2}
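+
+As a quick check of the equality above (an elementary step, added here for completeness), substituting @mymath{x=r\cos\phi} and @mymath{y=r\sin\phi} gives @mymath{dx=\cos\phi dr-r\sin\phi d\phi} and @mymath{dy=\sin\phi dr+r\cos\phi d\phi}; squaring and adding these reduces to @mymath{dr^2+r^2d\phi^2}.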
-If you just want to inspect the value of a variable visually, the description
(which comes with units) might be more useful.
-In such cases, the following command might be better.
-The other calculations will also be done, but they are so fast that you will
not notice on modern computers (the time it takes your eye to focus on the
result is usually longer than the processing: a fraction of a second).
+The main question is this: how can the 2D creature incorporate the (possible)
curvature in its universe when it's calculating distances? The universe that it
lives in might equally be a curved surface like @ref{sphereandplane}.
+Answering this same question, but for a 3D being (us), is the whole purpose of this discussion.
+Here, we want to give the 2D creature (and later, ourselves) the tools to
measure distances if the space (that hosts the objects) is curved.
-@example
-$ astcosmiccal --redshift=0.832 | grep volume
-@end example
+@ref{sphereandplane} assumes a spherical shell with radius @mymath{R} as the curved 2D surface for simplicity.
+The 2D plane is tangent to the spherical shell and only touches it at
@mymath{A}.
+This idea will be generalized later.
+The first step in measuring the distance in a curved space is to imagine a
third dimension along the @mymath{z} axis as shown in @ref{sphereandplane}.
+For simplicity, the @mymath{z} axis is assumed to pass through the center of
the spherical shell.
+Our imaginary 2D creature cannot visualize the third dimension or a curved 2D
surface within it, so the remainder of this discussion is purely abstract for
it (similar to us having difficulty in visualizing a 3D curved space in 4D).
+But since we are 3D creatures, we have the advantage of visualizing the
following steps.
+Fortunately the 2D creature is already familiar with our mathematical
constructs, so it can follow our reasoning.
-The full list of CosmicCalculator's specific calculations is present below in
two groups: basic cosmology calculations and those related to spectral lines.
-In case you have forgot the units, you can use the @option{--help} option
which has the units along with a short description.
+With the third axis added, a generic infinitesimal change over @emph{the full}
3D space corresponds to the distance:
-@table @option
+@dispmath{ds_s^2=dx^2+dy^2+dz^2=dr^2+r^2d\phi^2+dz^2.}
-@item -e
-@itemx --usedredshift
-The redshift that was used in this run.
-In many cases this is the main input parameter to CosmicCalculator, but it is
useful in others.
-For example, in combination with @option{--obsline} (where you give an
observed and rest-frame wavelength and would like to know the redshift) or with
@option{--velocity} (where you specify the velocity instead of redshift).
-Another example is when you run CosmicCalculator in a loop, while changing the
redshift and you want to keep the redshift value with the resulting calculation.
+@float Figure,sphereandplane
+@center@image{gnuastro-figures/sphereandplane, 10cm, , }
-@item -Y
-@itemx --usedvelocity
-The velocity (in km/s) that was used in this run.
-The conversion from redshift will be done with the more general and accurate
relativistic equation of @mymath{1+z=\sqrt{(c+v)/(c-v)}}, not the simplified
@mymath{z\approx v/c}.
+@caption{2D spherical shell (centered on @mymath{O}) and flat plane (light
gray) tangent to it at point @mymath{A}.}
+@end float
-@item -G
-@itemx --agenow
-The current age of the universe (given the input parameters) in Ga (Giga
annum, or billion years).
+It is very important to recognize that this change of distance is for
@emph{any} point in the 3D space, not just those changes that occur on the 2D
spherical shell of @ref{sphereandplane}.
+Recall that our 2D friend can only do measurements on the 2D surfaces, not the
full 3D space.
+So we have to constrain this general change to any change on the 2D spherical
shell.
+To do that, let's look at the arbitrary point @mymath{P} on the 2D spherical
shell.
+Its image (@mymath{P'}) on the flat plane is also displayed.
+From the dark gray triangle, we see that
-@item -C
-@itemx --criticaldensitynow
-The current critical density (given the input parameters) in grams per
centimeter-cube (@mymath{g/cm^3}).
+@dispmath{\sin\theta={r\over R},\quad\cos\theta={R-z\over R}.}These relations allow the 2D creature to find the value of @mymath{z} (an abstract dimension for it) as a function of @mymath{r} (distance on a flat 2D plane, which it can visualize) and thus eliminate @mymath{z}.
+From @mymath{\sin^2\theta+\cos^2\theta=1}, we get @mymath{z^2-2Rz+r^2=0} and
solving for @mymath{z}, we find:
-@item -d
-@itemx --properdistance
-The proper distance (at current time) to object at the given redshift in
Megaparsecs (Mpc).
-See @ref{Distance on a 2D curved space} for a description of the proper
distance.
+@dispmath{z=R\left(1\pm\sqrt{1-{r^2\over R^2}}\right).}
-@item -A
-@itemx --angulardimdist
-The angular diameter distance to object at given redshift in Megaparsecs (Mpc).
+The @mymath{\pm} can be understood from @ref{sphereandplane}: For each
@mymath{r}, there are two points on the sphere, one in the upper hemisphere and
one in the lower hemisphere.
+An infinitesimal change in @mymath{r} will create the following infinitesimal change in @mymath{z}:
-@item -s
-@itemx --arcsectandist
-The tangential distance covered by 1 arc-seconds at the given redshift in
kiloparsecs (Kpc).
-This can be useful when trying to estimate the resolution or pixel scale of an
instrument (usually in units of arc-seconds) at a given redshift.
+@dispmath{dz={\mp r\over R}\left(1\over
+\sqrt{1-{r^2/R^2}}\right)dr.}Substituting this relation for @mymath{dz} in the @mymath{ds_s^2} equation above (the sign is irrelevant, since @mymath{dz} is squared), we get:
-@item -L
-@itemx --luminositydist
-The luminosity distance to object at given redshift in Megaparsecs (Mpc).
+@dispmath{ds_s^2={dr^2\over 1-r^2/R^2}+r^2d\phi^2.}
-@item -u
-@itemx --distancemodulus
-The distance modulus at given redshift.
+The derivation above was done for a spherical shell of radius @mymath{R} as a
curved 2D surface.
+To generalize it to any surface, we can define @mymath{K=1/R^2} as the
curvature parameter.
+Then the general infinitesimal change in a static universe can be written as:
-@item -a
-@itemx --absmagconv
-The conversion factor (addition) to absolute magnitude.
-Note that this is practically the distance modulus added with
@mymath{-2.5\log{(1+z)}} for the desired redshift based on the input parameters.
-Once the apparent magnitude and redshift of an object is known, this value may
be added with the apparent magnitude to give the object's absolute magnitude.
+@dispmath{ds_s^2={dr^2\over 1-Kr^2}+r^2d\phi^2.}
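+
+As a quick sanity check, setting @mymath{K=0} in this relation recovers the flat-plane distance @mymath{ds_s^2=dr^2+r^2d\phi^2} found at the start of this section.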
-@item -g
-@itemx --age
-Age of the universe at given redshift in Ga (Giga annum, or billion years).
+Therefore, when @mymath{K>0} (and curvature is the same everywhere), we have a
finite universe, where @mymath{r} cannot become larger than @mymath{R} as in
@ref{sphereandplane}.
+When @mymath{K=0}, we have a flat plane (@ref{flatplane}) and a negative
@mymath{K} will correspond to an imaginary @mymath{R}.
+The latter two cases may be infinite in area (which is not a simple concept, but can mathematically be modeled with @mymath{r} extending infinitely), or finite in area (for example, a cylinder is flat everywhere, with @mymath{ds_s^2={dx^2 + dy^2}}, but finite in one direction).
-@item -b
-@itemx --lookbacktime
-The look-back time to given redshift in Ga (Giga annum, or billion years).
-The look-back time at a given redshift is defined as the current age of the
universe (@option{--agenow}) subtracted by the age of the universe at the given
redshift.
+@cindex Proper distance
+A very important issue that can be discussed now (while we are still in 2D and
can actually visualize things) is that @mymath{\overrightarrow{r}} is tangent
to the curved space at the observer's position.
+In other words, it is on the gray flat surface of @ref{sphereandplane}, even when the universe is curved: @mymath{\overrightarrow{r}=P'-A}.
+Therefore for the point @mymath{P} on a curved space, the raw coordinate
@mymath{r} is the distance to @mymath{P'}, not @mymath{P}.
+The distance to the point @mymath{P} (at a specific coordinate @mymath{r} on the flat plane) over the curved surface (thick line in @ref{sphereandplane}) is called the @emph{proper distance} and is denoted by @mymath{l}.
+For the specific example of @ref{sphereandplane}, the proper distance can be
calculated with: @mymath{l=R\theta} (@mymath{\theta} is in radians).
+Using the @mymath{\sin\theta} relation found above, we can find @mymath{l} as
a function of @mymath{r}:
-@item -c
-@itemx --criticaldensity
-The critical density at given redshift in grams per centimeter-cube
(@mymath{g/cm^3}).
+@dispmath{\theta=\sin^{-1}\left({r\over R}\right)\quad\rightarrow\quad
+l(r)=R\sin^{-1}\left({r\over R}\right)}
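+
+For example (simply evaluating this relation), at @mymath{r=R} the proper distance is @mymath{l=R\sin^{-1}(1)=\pi R/2}: over the curved surface, the distance to that point is roughly @mymath{1.57} times larger than its flat-plane coordinate @mymath{r}.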
-@item -v
-@itemx --onlyvolume
-The comoving volume in Megaparsecs cube (Mpc@mymath{^3}) until the desired
redshift based on the input parameters.
-@end table
+@mymath{R} is just an arbitrary constant and can be directly found from
@mymath{K}, so for cleaner equations, it is common practice to set
@mymath{R=1}, which gives: @mymath{l(r)=\sin^{-1}r}.
+Also note that when @mymath{R=1}, then @mymath{l=\theta}.
+Generally, depending on the curvature, in a @emph{static} universe the proper
distance can be written as a function of the coordinate @mymath{r} as (from now
on we are assuming @mymath{R=1}):
+@dispmath{l(r)=\sin^{-1}(r)\quad(K>0),\quad\quad
+l(r)=r\quad(K=0),\quad\quad l(r)=\sinh^{-1}(r)\quad(K<0).}
+
+With @mymath{l}, the infinitesimal change of distance can be written in a
+simpler and more abstract form of
+@dispmath{ds_s^2=dl^2+r^2d\phi^2.}
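+
+As a concrete check of these relations, take a point at the raw coordinate
+@mymath{r=0.5} (with @mymath{R=1}).
+Its proper distance is @mymath{l=\sin^{-1}(0.5)\approx0.524} when @mymath{K>0},
+@mymath{l=0.5} when @mymath{K=0}, and @mymath{l=\sinh^{-1}(0.5)\approx0.481}
+when @mymath{K<0}: positive curvature stretches the proper distance relative
+to the flat case, while negative curvature shrinks it.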
+@cindex Comoving distance
+Until now, we had assumed a static universe (not changing with time).
+But our observations so far appear to indicate that the universe is expanding
(it is not static).
+Since there is no reason to expect the observed expansion is unique to our
particular position in the universe, we expect the universe to be expanding at
all points with the same rate at the same time.
+Therefore, to add a time dependence to our distance measurements, we can
include a multiplicative scaling factor, which is a function of time:
@mymath{a(t)}.
+The functional form of @mymath{a(t)} comes from the cosmology and the physics
we assume for it: general relativity, and the choice of whether the universe is
uniform (`homogeneous') in density and curvature, or inhomogeneous.
+In this section, the functional form of @mymath{a(t)} is irrelevant, so we can
avoid these issues.
-@node CosmicCalculator spectral line calculations, , CosmicCalculator basic
cosmology calculations, Invoking astcosmiccal
-@subsubsection CosmicCalculator spectral line calculations
+With this scaling factor, the proper distance will also depend on time.
+As the universe expands, the distance between two given points will shift to
larger values.
+We thus define a distance measure, or coordinate, that is independent of time
and does not `move'.
+We call it the @emph{comoving distance} and display it with @mymath{\chi} such
that: @mymath{l(r,t)=\chi(r)a(t)}.
+We have therefore shifted the @mymath{r} dependence of the proper distance we
derived above for a static universe to the comoving distance:
-@cindex Rest frame wavelength
-At different redshifts, observed spectral lines are shifted compared to their
rest frame wavelengths with this simple relation:
@mymath{\lambda_{obs}=\lambda_{rest}(1+z)}.
-Although this relation is very simple and can be evaluated for one line in
your head (or with a simple calculator!), it quickly becomes tiring when
dealing with many lines or redshifts, or when some precision is necessary.
-The options in this section are thus provided to greatly simplify usage of
this simple equation; they also help by storing a list of pre-defined
spectral line wavelengths.
+@dispmath{\chi(r)=\sin^{-1}(r)\quad(K>0),\quad\quad
+\chi(r)=r\quad(K=0),\quad\quad \chi(r)=\sinh^{-1}(r)\quad(K<0).}
-For example, if you want to know the wavelength of the @mymath{H\alpha} line
(at 6562.8 Angstroms in rest frame), when @mymath{Ly\alpha} is at 8000
Angstroms, you can call CosmicCalculator like the first example below.
-And if you want the wavelength of all pre-defined spectral lines at this
redshift, you can use the second command.
+Therefore, @mymath{\chi(r)} is the proper distance to an object at a specific
reference time: @mymath{t=t_r} (the @mymath{r} subscript signifies
``reference'') when @mymath{a(t_r)=1}.
+At any arbitrary moment (@mymath{t\neq{t_r}}) before or after @mymath{t_r},
the proper distance to the object can be scaled with @mymath{a(t)}.
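+
+For example, an object at a comoving distance of @mymath{\chi=100} Mpc has a
+proper distance of 100 Mpc at the reference time (when @mymath{a(t_r)=1}),
+but only @mymath{l=0.5\times100=50} Mpc at an earlier moment when
+@mymath{a(t)=0.5}.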
-@example
-$ astcosmiccal --obsline=lyalpha,8000 --lineatz=halpha
-$ astcosmiccal --obsline=lyalpha,8000 --listlinesatz
-@end example
+Measuring the change of distance in a time-dependent (expanding) universe only
makes sense if we can add up space and time@footnote{In other words, making our
space-time consistent with Minkowski space-time geometry.
+In this geometry, different observers at a given point (event) in space-time
split up space-time into `space' and `time' in different ways, just like people
at the same spatial position can make different choices of splitting up a map
into `left--right' and `up--down'.
+This model is well supported by twentieth and twenty-first century
observations.}.
+But we can only add bits of space and time together if we measure them in the
same units: with a conversion constant (similar to how 1000 is used to convert
a kilometer into meters).
+Experimentally, we find strong support for the hypothesis that this conversion
constant is the speed of light (or gravitational waves@footnote{The speed of
gravitational waves was recently found to be very similar to that of light in
vacuum, see @url{https://arxiv.org/abs/1710.05834, arXiv:1710.05834}.}) in a
vacuum.
+This speed is postulated to be constant@footnote{In @emph{natural units},
speed is measured in units of the speed of light in vacuum.} and is almost
always written as @mymath{c}.
+We can thus parameterize the change in distance on an expanding 2D surface as
-Below you can see the printed/output calculations of CosmicCalculator that
are related to spectral lines.
-Note that @option{--obsline} is an input parameter, so it is discussed (with
the full list of known lines) in @ref{CosmicCalculator input options}.
+@dispmath{ds^2=c^2dt^2-a^2(t)ds_s^2 = c^2dt^2-a^2(t)(d\chi^2+r^2d\phi^2).}
-@table @option
-@item --listlines
-List the pre-defined rest frame spectral line wavelengths and their names on
standard output, then abort CosmicCalculator.
-The units of the displayed wavelengths for each line can be determined with
@option{--lineunit} (see below).
+@node Extending distance concepts to 3D, Invoking astcosmiccal, Distance on a
2D curved space, CosmicCalculator
+@subsection Extending distance concepts to 3D
-When this option is given, other operations on the command-line will be
ignored.
-This is convenient when you forget the specific name of the spectral line used
within Gnuastro, or when you forget the exact wavelength of a certain line.
+The concepts of @ref{Distance on a 2D curved space} are here extended to a 3D
space that @emph{might} be curved.
+We can start with the generic infinitesimal distance in a static 3D universe,
but this time in spherical coordinates instead of polar coordinates.
+@mymath{\theta} is shown in @ref{sphereandplane}, but here we are 3D beings,
positioned at @mymath{O} (the center of the sphere), and at the point
@mymath{O} our 3D space is tangent to a 4D-sphere.
+In our 3D space, a generic infinitesimal displacement will correspond to the
following distance in spherical coordinates:
-These names can be used with the options that deal with spectral lines, for
example, @option{--obsline} and @option{--lineatz} (@ref{CosmicCalculator basic
cosmology calculations}).
+@dispmath{ds_s^2=dx^2+dy^2+dz^2=dr^2+r^2(d\theta^2+\sin^2{\theta}d\phi^2).}
-The format of the output list is a two-column table, with Gnuastro's text
table format (see @ref{Gnuastro text table format}).
-Therefore, if you are only looking for lines in a specific range, you can pipe
the output into Gnuastro's table program and use its @option{--range} option on
the @code{wavelength} (first) column.
-For example, if you only want to see the lines between 4000 and 6000
Angstroms, you can run this command:
+Like the 2D creature before, we now have to assume an abstract dimension which
we cannot visualize easily.
+Let's call the fourth dimension @mymath{w}, then the general change in
coordinates in the @emph{full} four dimensional space will be:
-@example
-$ astcosmiccal --listlines \
- | asttable --range=wavelength,4000,6000
-@end example
+@dispmath{ds_s^2=dr^2+r^2(d\theta^2+\sin^2{\theta}d\phi^2)+dw^2.}
@noindent
-And if you want to use the list later and have it as a table in a file, you
can easily add the @option{--output} (or @option{-o}) option to the
@command{asttable} command, and specify the filename, for example,
@option{--output=lines.fits} or @option{--output=lines.txt}.
-
-@item --listlinesatz
-Similar to @option{--listlines} (above), but the printed wavelength is not in
the rest frame, but redshifted to the given redshift.
-Recall that the redshift can be specified by @option{--redshift} directly or
by @option{--obsline}, see @ref{CosmicCalculator input options}.
-For an example usage of this option, see @ref{Viewing spectra and redshifted
lines}.
+But we can only work on a 3D curved space, so following exactly the same steps
and conventions as our 2D friend, we arrive at:
-@item -i STR/FLT
-@itemx --lineatz=STR/FLT
-The wavelength of the specified line at the redshift given to CosmicCalculator.
-The line can be specified either by its name or directly as a number (its
wavelength).
-The units of the displayed wavelengths for each line can be determined with
@option{--lineunit} (see below).
+@dispmath{ds_s^2={dr^2\over 1-Kr^2}+r^2(d\theta^2+\sin^2{\theta}d\phi^2).}
-To get the list of pre-defined names for the lines and their wavelength, you
can use the @option{--listlines} option, see @ref{CosmicCalculator input
options}.
-In the former case (when a name is given), the returned number is in units of
Angstroms.
-In the latter (when a number is given), the returned value is the same units
of the input number (assuming it is a wavelength).
+@noindent
+In a non-static universe (with a scale factor @mymath{a(t)}), the distance can
be written as:
-@item --lineunit=STR
-The units to display line wavelengths above.
-It can take the following four values.
-If you need any other unit, please contact us at @code{bug-gnuastro@@gnu.org}.
+@dispmath{ds^2=c^2dt^2-a^2(t)[d\chi^2+r^2(d\theta^2+\sin^2{\theta}d\phi^2)].}
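+
+For example, for a light ray coming straight toward us (@mymath{ds=0} with
+@mymath{d\theta=d\phi=0}), this metric reduces to @mymath{cdt=-a(t)d\chi};
+integrating this relation over cosmic time is what connects an observed
+redshift to the distances that are discussed in the options below.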
-@table @code
-@item m
-Meter.
-@item micro-m
-Micrometer or @mymath{10^{-6}m}.
-@item nano-m
-Nanometer, or @mymath{10^{-9}m}.
-@item angstrom
-Angstrom or @mymath{10^{-10}m}; the default unit when this option is not
called.
-@end table
-@end table
+@c@dispmath{H(z){\equiv}\left(\dot{a}\over a\right)(z)=H_0E(z) }
+@c@dispmath{E(z)=[ \Omega_{\Lambda,0} + \Omega_{C,0}(1+z)^2 +
+@c\Omega_{m,0}(1+z)^3 + \Omega_{r,0}(1+z)^4 ]^{1/2}}
+@c Let's take @mymath{r} to be the radial coordinate of the emitting
+@c source, which emitted its light at redshift $z$. Then the comoving
+@c distance of this object would be:
+@c@dispmath{ \chi(r)={c\over H_0a_0}\int_0^z{dz'\over E(z')} }
+@c@noindent
+@c So the proper distance at the current time to that object is:
+@c @mymath{a_0\chi(r)}, therefore the angular diameter distance
+@c (@mymath{d_A}) and luminosity distance (@mymath{d_L}) can be written
+@c as:
+@c@dispmath{ d_A={a_0\chi(r)\over 1+z}, \quad d_L=a_0\chi(r)(1+z) }
-@node Installed scripts, Makefile extensions, High-level calculations, Top
-@chapter Installed scripts
-Gnuastro's programs (introduced in previous chapters) are designed to be
highly modular and thus contain lower-level operations on the data.
-However, many higher-level operations are also shared between different
contexts.
-For example, a sequence of calls to multiple Gnuastro programs, or a special
way of running a program and treating the output.
-To facilitate such higher-level data analysis, Gnuastro also installs some
scripts on your system with the (@code{astscript-}) prefix (in contrast to the
other programs that only have the @code{ast} prefix).
+@node Invoking astcosmiccal, , Extending distance concepts to 3D,
CosmicCalculator
+@subsection Invoking CosmicCalculator
-@cindex GNU Bash
-@cindex Portable shell
-@cindex Shell, portable
-Like all of Gnuastro's source code, these scripts are also heavily commented.
-They are written as portable shell scripts (for command-line environments),
which do not need compilation.
-Therefore, if you open the installed scripts in a text editor, you can
actually read them@footnote{Gnuastro's installed programs (those only starting
with @code{ast}) are not human-readable.
-They are written in C and need to be compiled before execution.
-Compilation optimizes the steps into the low-level hardware CPU
instructions/language to improve efficiency.
-Because compiled programs do not need an interpreter like Bash on every run,
they are much faster and more independent than scripts.
-To read the source code of the programs, look into the @file{bin/progname}
directory of Gnuastro's source (@ref{Downloading the source}).
-If you would like to read more about why C was chosen for the programs, please
see @ref{Why C}.}.
-For example, with this command (just replace @code{nano} with your favorite
text editor, like @command{emacs} or @command{vim}):
+CosmicCalculator will calculate cosmological variables based on the input
parameters.
+The executable name is @file{astcosmiccal} with the following general template
@example
-$ nano $(which astscript-NAME)
+$ astcosmiccal [OPTION...] ...
@end example
-Shell scripting is the same language that you use when typing on the
command-line.
-Therefore shell scripting is much more widely known and used compared to C
(the language of other Gnuastro programs).
-Because Gnuastro's installed scripts do higher-level operations, customizing
these scripts for a special project will be more common than the programs.
-These scripts also accept options and are in many ways similar to the programs
(see @ref{Common options}) with some minor differences:
+@noindent
+One line examples:
-@itemize
-@item
-Currently they do not accept configuration files themselves.
-However, the configuration files of the Gnuastro programs they call are indeed
parsed and used by those programs.
+@example
+## Print basic cosmological properties at redshift 2.5:
+$ astcosmiccal -z2.5
-As a result, they do not have the following options: @option{--checkconfig},
@option{--config}, @option{--lastconfig}, @option{--onlyversion},
@option{--printparams}, @option{--setdirconf} and @option{--setusrconf}.
+## Only print comoving volume over 4pi steradian to z (Mpc^3):
+$ astcosmiccal --redshift=0.8 --volume
-@item
-They do not directly allocate any memory, so there is no @option{--minmapsize}.
+## Print redshift and age of universe when Lyman-alpha line is
+## at 6000 angstrom (another way to specify redshift).
+$ astcosmiccal --obsline=Ly-alpha,6000 --age
-@item
-They do not have an independent @option{--usage} option: when called with
@option{--usage}, they just recommend running @option{--help}.
+## Print luminosity distance, angular diameter distance and age
+## of universe in one row at redshift 0.4
+$ astcosmiccal -z0.4 -LAg
-@item
-The output of @option{--help} is not configurable like the programs (see
@ref{--help}).
+## Assume Lambda and matter density of 0.7 and 0.3 and print
+## basic cosmological parameters for redshift 2.1:
+$ astcosmiccal -l0.7 -m0.3 -z2.1
-@item
-@cindex GNU AWK
-@cindex GNU SED
-The scripts will commonly use your installed shell and other basic
command-line tools (for example, AWK or SED).
-Different systems have different versions and implementations of these basic
tools (for example, GNU/Linux systems use GNU Bash, GNU AWK and GNU SED, which
are far more advanced and up to date than the minimalist AWK and SED of most
other systems).
-Therefore, unexpected errors in these tools might come up when you run these
scripts on non-GNU/Linux operating systems.
-If you do confront such strange errors, please submit a bug report so we fix
it as soon as possible (see @ref{Report a bug}).
+## Print wavelength of all pre-defined spectral lines when
+## Lyman-alpha is observed at 4000 Angstroms.
+$ astcosmiccal --obsline=Ly-alpha,4000 --listlinesatz
+@end example
+
+The input parameters (current matter density, etc.) can be given as
command-line options or in the configuration files, see @ref{Configuration
files}.
+For a definition of the different parameters, please see the sections prior to
this.
+If no redshift is given, CosmicCalculator will just print its input parameters
and abort.
+For a full list of the input options, please see @ref{CosmicCalculator input
options}.
+
+Without any particular output requested (and only a given redshift),
CosmicCalculator will print all basic cosmological calculations (one per line)
with some explanations before each.
+This can be good when you want a general feeling of the conditions at a
specific redshift.
+Alternatively, if any specific calculation(s) are requested (it is possible to
call more than one), only the requested value(s) will be calculated and printed
with one character space between them.
+In this case, no description or units will be printed.
+See @ref{CosmicCalculator basic cosmology calculations} for the full list of
these options along with some explanation of when/how they can be useful.
-@end itemize
+Another common operation in observational cosmology is dealing with spectral
lines at different redshifts.
+CosmicCalculator also has features to help in such situations, please see
@ref{CosmicCalculator spectral line calculations}.
@menu
-* Sort FITS files by night:: Sort many files by date.
-* Generate radial profile:: Radial profile of an object in an image.
-* SAO DS9 region files from table:: Create ds9 region file from a table.
-* Viewing FITS file contents with DS9 or TOPCAT:: Open DS9 (images/cubes) or
TOPCAT (tables).
-* Zero point estimation:: Zero point of an image from reference catalog
or image(s).
-* Dithering pattern simulation:: Simulate a stack with a certain dithering
pattern.
-* PSF construction and subtraction:: Set of scripts to create extended PSF of
an image.
+* CosmicCalculator input options:: Options to specify input conditions.
+* CosmicCalculator basic cosmology calculations:: Such as distance modulus
and distances.
+* CosmicCalculator spectral line calculations:: How they get affected by
redshift.
@end menu
-@node Sort FITS files by night, Generate radial profile, Installed scripts,
Installed scripts
-@section Sort FITS files by night
+@node CosmicCalculator input options, CosmicCalculator basic cosmology
calculations, Invoking astcosmiccal, Invoking astcosmiccal
+@subsubsection CosmicCalculator input options
-@cindex Calendar
-FITS images usually contain (several) keywords for preserving important dates.
-In particular, for lower-level data, this is usually the observation date and
time (for example, stored in the @code{DATE-OBS} keyword value).
-When analyzing observed datasets, many calibration steps (like the dark, bias
or flat-field), are commonly calculated on a per-observing-night basis.
+The inputs to CosmicCalculator can be specified with the following options:
+@table @option
-However, the FITS standard's date format (@code{YYYY-MM-DDThh:mm:ss.ddd}) is
based on the western (Gregorian) calendar.
-Dates that are stored in this format are complicated for automatic processing:
a night starts in the final hours of one calendar day, and extends to the early
hours of the next calendar day.
-As a result, to identify datasets from one night, we commonly need to search
for two dates.
-However calendar peculiarities can make this identification very difficult.
-For example, when an observation is done on the night separating two months
(like the night starting on March 31st and going into April 1st), or two years
(like the night starting on December 31st 2018 and going into January 1st,
2019).
-To account for such situations, it is necessary to keep track of how many days
are in a month, and leap years, etc.
+@item -z FLT
+@itemx --redshift=FLT
+The redshift of interest.
+There are two other ways that you can specify the target redshift:
+1) Spectral lines and their observed wavelengths, see @option{--obsline}.
+2) Velocity, see @option{--velocity}.
+Hence this option cannot be called with @option{--obsline} or
@option{--velocity}.
-@cindex Unix epoch time
-@cindex Time, Unix epoch
-@cindex Epoch, Unix time
-Gnuastro's @file{astscript-sort-by-night} script is created to help in such
important scenarios.
-It uses @ref{Fits} to convert the FITS date format into the Unix epoch time
(number of seconds since 00:00:00 of January 1st, 1970), using the
@option{--datetosec} option.
-The Unix epoch time is a single number (integer, if not given in sub-second
precision), enabling easy comparison and sorting of dates after January 1st,
1970.
+@item -y FLT
+@itemx --velocity=FLT
+Input velocity in km/s.
+The given value will be converted to redshift internally, and used in any
subsequent calculation.
+This option is thus an alternative to @code{--redshift} or @code{--obsline};
it cannot be used with them.
+The conversion will be done with the more general and accurate relativistic
equation of @mymath{1+z=\sqrt{(c+v)/(c-v)}}, not the simplified
@mymath{z\approx v/c}.
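+
+For example, an input velocity of one tenth the speed of light
+(@mymath{v=0.1c}) gives @mymath{1+z=\sqrt{1.1/0.9}\approx1.106}, or
+@mymath{z\approx0.106}: slightly larger than the @mymath{z\approx0.1} of the
+simplified relation.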
-You can use this script as a basis for making a more customized sorting
script.
-Here are some examples:
+@item -H FLT
+@itemx --H0=FLT
+Current expansion rate (in km sec@mymath{^{-1}} Mpc@mymath{^{-1}}).
-@itemize
-@item
-If you need to copy the files, but only need a single extension (not the whole
file), you can add a step just before the making of the symbolic links, or
copies, and change it to only copy a certain extension of the FITS file using
the Fits program's @option{--copy} option, see @ref{HDU information and
manipulation}.
+@item -l FLT
+@itemx --olambda=FLT
+Cosmological constant density divided by the critical density in the current
Universe (@mymath{\Omega_{\Lambda,0}}).
-@item
-If you need to classify the files with finer detail (for example, the purpose
of the dataset), you can add a step just before the making of the symbolic
links, or copies, to specify a file-name prefix based on other certain keyword
values in the files.
-For example, when the FITS files have a keyword to specify if the dataset is a
science, bias, or flat-field image.
-You can read it and add a @code{sci-}, @code{bias-}, or @code{flat-} to the
created file name (after the @option{--prefix}) automatically.
+@item -m FLT
+@itemx --omatter=FLT
+Matter (including massive neutrinos) density divided by the critical density
in the current Universe (@mymath{\Omega_{m,0}}).
-For example, let's assume the observing mode is stored in the hypothetical
@code{MODE} keyword, which can have three values of @code{BIAS-IMAGE},
@code{SCIENCE-IMAGE} and @code{FLAT-EXP}.
-With the step below, you can generate a mode-prefix, and add it to the
generated link/copy names (just correct the filename and extension of the first
line to the script's variables):
+@item -r FLT
+@itemx --oradiation=FLT
+Radiation density divided by the critical density in the current Universe
(@mymath{\Omega_{r,0}}).
+
+@item -O STR/FLT,FLT
+@itemx --obsline=STR/FLT,FLT
+@cindex Rest-frame wavelength
+@cindex Wavelength, rest-frame
+Find the redshift to use in next steps based on the rest-frame and observed
wavelengths of a line.
+This option is thus an alternative to @code{--redshift} or @code{--velocity};
it cannot be used with them.
+
+The first argument identifies the line.
+It can be one of the standard names, or any rest-frame wavelength in Angstroms.
+The second argument is the observed wavelength of that line.
+For example, @option{--obsline=Ly-alpha,6000} is the same as
@option{--obsline=1215.64,6000}.
+Wavelengths are assumed to be in Angstroms by default (other units can be
selected with @option{--lineunit}, see @ref{CosmicCalculator spectral line
calculations}).
+
+The list of pre-defined names for the lines in Gnuastro's database is
available by running
@example
-modepref=$(astfits infile.fits -h1 \
- | sed -e"s/'/ /g" \
- | awk '$1=="MODE"@{ \
- if($3=="BIAS-IMAGE") print "bias-"; \
- else if($3=="SCIENCE-IMAGE") print "sci-"; \
- else if($3=="FLAT-EXP") print "flat-"; \
- else print $3, "NOT recognized"; exit 1@}')
+$ astcosmiccal --listlines
@end example
+@end table
-@cindex GNU AWK
-@cindex GNU Sed
-Here is a description of it.
-We first use @command{astfits} to print all the keywords in extension @code{1}
of @file{infile.fits}.
-In the FITS standard, string values (that we are assuming here) are placed in
single quotes (@key{'}) which are annoying in this context/use-case.
-Therefore, we pipe the output of @command{astfits} into @command{sed} to
remove all such quotes (substituting them with a blank space).
-The result is then piped to AWK for giving us the final mode-prefix: with
@code{$1=="MODE"}, we ask AWK to only consider the line where the first column
is @code{MODE}.
-There is an equal sign between the key name and value, so the value is the
third column (@code{$3} in AWK).
-We thus use a simple @code{if-else} structure to look into this value and
print our custom prefix based on it.
-The output of AWK is then stored in the @code{modepref} shell variable which
you can add to the link/copy name.
-With the solution above, the increment of the file counter for each night will
be independent of the mode.
-If you want the counter to be mode-dependent, you can add a different counter
for each mode and use that counter instead of the generic counter for each
night (based on the value of @code{modepref}).
-But we will leave the implementation of this step to you as an exercise.
-@end itemize
+@node CosmicCalculator basic cosmology calculations, CosmicCalculator spectral
line calculations, CosmicCalculator input options, Invoking astcosmiccal
+@subsubsection CosmicCalculator basic cosmology calculations
+By default, when no specific calculations are requested, CosmicCalculator will
print a complete set of all its calculations (one line for each calculation,
see @ref{Invoking astcosmiccal}).
+The full list of calculations can be useful when you do not want any specific
value, but just a general view.
+In other contexts (for example, in a batch script or during a discussion), you
know exactly what you want and do not want to be distracted by all the extra
information.
-@menu
-* Invoking astscript-sort-by-night:: Inputs and outputs to this script.
-@end menu
+You can use any number of the options described below in any order.
+When any of these options are requested, CosmicCalculator's output will just
be a single line with a single space between the (possibly) multiple values.
+In the example below, only the tangential distance along one arc-second (in
kpc), absolute magnitude conversion, and age of the universe at redshift 2 are
printed (recall that you can merge short options together, see @ref{Options}).
-@node Invoking astscript-sort-by-night, , Sort FITS files by night, Sort FITS
files by night
-@subsection Invoking astscript-sort-by-night
+@example
+$ astcosmiccal -z2 -sag
+8.585046 44.819248 3.289979
+@end example
-This installed script will read a FITS date formatted value from the given
keyword, and classify the input FITS files into individual nights.
-For more on installed scripts please see (see @ref{Installed scripts}).
-This script can be used with the following general template:
+Here is one example of using this feature in scripts: the two lines below
store the comoving volume at a given redshift in a shell variable for later
use:
@example
-$ astscript-sort-by-night [OPTION...] FITS-files
+z=3.12
+vol=$(astcosmiccal --redshift=$z --volume)
@end example
+@cindex GNU Grep
@noindent
-One line examples:
+In a script, this operation might be necessary for a large number of objects
(for example, the galaxies of a catalog).
+So the fact that all the other default calculations are ignored will also help
you get to your result faster.
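+
+For example, a minimal sketch of such a loop (assuming a hypothetical catalog
+@file{cat.fits} that has the redshifts in a @code{Z} column) could be:
+
+@example
+$ for z in $(asttable cat.fits -cZ); do \
+    astcosmiccal -z$z --volume;         \
+  done > volumes.txt
+@end example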
-@example
-## Use the DATE-OBS keyword
-$ astscript-sort-by-night --key=DATE-OBS /path/to/data/*.fits
+If you are indeed dealing with many redshifts (for example, thousands), using
CosmicCalculator is not the best/fastest solution, because it has to go
through all the configuration files and preparations for each invocation.
+To get the best efficiency (least overhead), we recommend using Gnuastro's
cosmology library (see @ref{Cosmology library}).
+CosmicCalculator also calls the library functions defined there for its
calculations, so you get the same result with no overhead.
+Gnuastro also has libraries for easily reading tables into a C program, see
@ref{Table input output}.
+Afterwards, you can easily build and run your C program for the particular
processing with @ref{BuildProgram}.
-## Make links to the input files with the `img-' prefix
-$ astscript-sort-by-night --link --prefix=img- /path/to/data/*.fits
+If you just want to inspect the value of a variable visually, the description
(which comes with units) might be more useful.
+In such cases, the following command might be better.
+The other calculations will also be done, but they are so fast that you will
not notice on modern computers (the time it takes your eye to focus on the
result is usually longer than the processing: a fraction of a second).
+
+@example
+$ astcosmiccal --redshift=0.832 | grep volume
@end example
-This script will look into a HDU/extension (@option{--hdu}) for a keyword
(@option{--key}) in the given FITS files and interpret the value as a date.
-The inputs will be separated by "night"s (11:00a.m to next day's 10:59:59a.m,
spanning two calendar days, exact hour can be set with @option{--hour}).
+The full list of CosmicCalculator's specific calculations is present below in
two groups: basic cosmology calculations and those related to spectral lines.
+In case you have forgotten the units, you can use the @option{--help} option,
which lists the units along with a short description.
-The default output is a list of all the input files along with the following
two columns: night number and file number in that night (sorted by time).
-With @option{--link} a symbolic link will be made (one for each input) that
contains the night number, and number of file in that night (sorted by time),
see the description of @option{--link} for more.
-When @option{--copy} is used instead of a link, a copy of the inputs will be
made instead of symbolic link.
+@table @option
-Below you can see one example where all the @file{target-*.fits} files in the
@file{data} directory should be separated by observing night according to the
@code{DATE-OBS} keyword value in their second extension (number @code{1},
recall that HDU counting starts from 0).
-You can see the output after the @code{ls} command.
+@item -e
+@itemx --usedredshift
+The redshift that was used in this run.
+In many cases this is the main input parameter to CosmicCalculator, but this
option is useful when the redshift was derived from other inputs.
+For example, in combination with @option{--obsline} (where you give an
observed and rest-frame wavelength and would like to know the redshift) or with
@option{--velocity} (where you specify the velocity instead of redshift).
+Another example is when you run CosmicCalculator in a loop, while changing the
redshift and you want to keep the redshift value with the resulting calculation.
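+
+For example, the command below uses an observed @mymath{Ly\alpha} wavelength
+to set the redshift, then prints the derived redshift along with the age of
+the universe at that redshift:
+
+@example
+$ astcosmiccal --obsline=Ly-alpha,6000 --usedredshift --age
+@end example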
-@example
-$ astscript-sort-by-night -pimg- -h1 -kDATE-OBS data/target-*.fits
-$ ls
-img-n1-1.fits img-n1-2.fits img-n2-1.fits ...
-@end example
+@item -Y
+@itemx --usedvelocity
+The velocity (in km/s) that was used in this run.
+The conversion from redshift will be done with the more general and accurate
relativistic equation of @mymath{1+z=\sqrt{(c+v)/(c-v)}}, not the simplified
@mymath{z\approx v/c}.
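+
+For example, to see the relativistic velocity corresponding to a redshift of
+0.01:
+
+@example
+$ astcosmiccal -z0.01 --usedvelocity
+@end example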
-The outputs can be placed in a different (already existing) directory by
including that directory's name in the @option{--prefix} value, for example,
@option{--prefix=sorted/img-} will put them all under the @file{sorted}
directory.
+@item -G
+@itemx --agenow
+The current age of the universe (given the input parameters) in Ga (Giga
annum, or billion years).
-This script can be configured like all Gnuastro's programs (through
command-line options, see @ref{Common options}), with some minor differences
that are described in @ref{Installed scripts}.
-The particular options to this script are listed below:
+@item -C
+@itemx --criticaldensitynow
+The current critical density (given the input parameters) in grams per
centimeter-cube (@mymath{g/cm^3}).
-@table @option
-@item -h STR
-@itemx --hdu=STR
-The HDU/extension to use in all the given FITS files.
-All of the given FITS files must have this extension.
+@item -d
+@itemx --properdistance
+The proper distance (at current time) to object at the given redshift in
Megaparsecs (Mpc).
+See @ref{Distance on a 2D curved space} for a description of the proper
distance.
-@item -k STR
-@itemx --key=STR
-The keyword name that contains the FITS date format to classify/sort by.
+@item -A
+@itemx --angulardimdist
+The angular diameter distance to object at given redshift in Megaparsecs (Mpc).
-@item -H FLT
-@itemx --hour=FLT
-The hour that defines the next ``night''.
-By default, all times before 11:00a.m are considered to belong to the previous
calendar night.
-If a sub-hour value is necessary, it should be given in units of hours, for
example, @option{--hour=9.5} corresponds to 9:30a.m.
+@item -s
+@itemx --arcsectandist
+The tangential distance covered by 1 arc-second at the given redshift, in
kiloparsecs (kpc).
+This can be useful when trying to estimate the resolution or pixel scale of an
instrument (usually in units of arc-seconds) at a given redshift.
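+
+For example, in the @option{-z2 -sag} output shown previously, one arc-second
+corresponds to roughly 8.6 kpc at redshift 2; so a detector with a pixel scale
+of 0.2 arc-seconds covers roughly @mymath{0.2\times8.6=1.7} kpc per pixel at
+that redshift.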
-@cartouche
-@noindent
-@cindex Time zone
-@cindex UTC (Universal time coordinate)
-@cindex Universal time coordinate (UTC)
-@strong{Dealing with time zones:}
-The time that is recorded in @option{--key} may be in UTC (Universal Time
Coordinate).
-However, the organization of the images taken during the night depends on the
local time.
-It is possible to take this into account by setting the @option{--hour} option
to the local time in UTC.
+@item -L
+@itemx --luminositydist
+The luminosity distance to object at given redshift in Megaparsecs (Mpc).
+
+@item -u
+@itemx --distancemodulus
+The distance modulus at given redshift.
+
+@item -a
+@itemx --absmagconv
+The conversion factor (addition) to absolute magnitude.
+Note that this is practically the distance modulus added with
@mymath{-2.5\log{(1+z)}} for the desired redshift based on the input parameters.
+Once the apparent magnitude and redshift of an object are known, this value
may be added to the apparent magnitude to give the object's absolute magnitude.
+
+@item -g
+@itemx --age
+Age of the universe at given redshift in Ga (Giga annum, or billion years).
+
+@item -b
+@itemx --lookbacktime
+The look-back time to given redshift in Ga (Giga annum, or billion years).
+The look-back time at a given redshift is defined as the current age of the
universe (@option{--agenow}) minus the age of the universe at the given
redshift.
+
+@item -c
+@itemx --criticaldensity
+The critical density at given redshift in grams per centimeter-cube
(@mymath{g/cm^3}).
+
+@item -v
+@itemx --onlyvolume
+The comoving volume, in cubic Megaparsecs (Mpc@mymath{^3}), up to the desired
redshift, based on the input parameters.
+
+@end table
-For example, consider a set of images taken in Auckland (New Zealand, UTC+12)
during different nights.
-If you want to classify these images by night, you have to know at which time
(in UTC time) the Sun rises (or any other separator/definition of a different
night).
-For example, if your observing night finishes before 9:00a.m in Auckland, you
can use @option{--hour=21}.
-Because in Auckland the local time of 9:00 corresponds to 21:00 UTC.
-@end cartouche
-@item -l
-@itemx --link
-Create a symbolic link for each input FITS file.
-This option cannot be used with @option{--copy}.
-The link will have a standard name in the following format (variable parts are
written in @code{CAPITAL} letters and described after it):
-@example
-PnN-I.fits
-@end example
-@table @code
-@item P
-This is the value given to @option{--prefix}.
-By default, its value is @code{./} (to store the links in the directory this
script was run in).
-See the description of @code{--prefix} for more.
-@item N
-This is the night-counter: starting from 1.
-@code{N} is just incremented by 1 for the next night, no matter how many
nights (without any dataset) there are between two subsequent observing nights
(it is just an identifier for each night, which you can easily map to different
calendar nights).
-@item I
-File counter in that night, sorted by time.
-@end table
+@node CosmicCalculator spectral line calculations, , CosmicCalculator basic
cosmology calculations, Invoking astcosmiccal
+@subsubsection CosmicCalculator spectral line calculations
-@item -c
-@itemx --copy
-Make a copy of each input FITS file with the standard naming convention
described in @option{--link}.
-With this option, instead of making a link, a copy is made.
-This option cannot be used with @option{--link}.
+@cindex Rest frame wavelength
+At different redshifts, observed spectral lines are shifted compared to their
rest frame wavelengths with this simple relation:
@mymath{\lambda_{obs}=\lambda_{rest}(1+z)}.
+Although this relation is very simple and can be evaluated for one line in
your head (or with a simple calculator!), it quickly becomes tiring when
dealing with many lines or redshifts, or when some precision is necessary.
+The options in this section are thus provided to greatly simplify usage of
this simple equation; they also help by storing a list of pre-defined
spectral line wavelengths.
-@item -p STR
-@itemx --prefix=STR
-Prefix to append before the night-identifier of each newly created link or
copy.
-This option is thus only relevant with the @option{--copy} or @option{--link}
options.
-See the description of @option{--link} for how it is used.
-For example, with @option{--prefix=img-}, all the created file names in the
current directory will start with @code{img-}, making outputs like
@file{img-n1-1.fits} or @file{img-n3-42.fits}.
+For example, if you want to know the wavelength of the @mymath{H\alpha} line
(at 6562.8 Angstroms in rest frame), when @mymath{Ly\alpha} is at 8000
Angstroms, you can call CosmicCalculator like the first example below.
+And if you want the wavelength of all pre-defined spectral lines at this
redshift, you can use the second command.
-@option{--prefix} can also be used to store the links/copies in another
directory relative to the directory this script is being run (it must already
exist).
-For example, @code{--prefix=/path/to/processing/img-} will put all the
links/copies in the @file{/path/to/processing} directory, and the files (in
that directory) will all start with @file{img-}.
+@example
+$ astcosmiccal --obsline=Ly-alpha,8000 --lineatz=H-alpha
+$ astcosmiccal --obsline=Ly-alpha,8000 --listlinesatz
+@end example
-@item --stdintimeout=INT
-Number of micro-seconds to wait for standard input within this script.
-This does not correspond to general inputs into the script, inputs to the
script should always be given as a file.
-However, within the script, pipes are often used to pass the output of one
program to another.
-The value given to this option will be passed to those internal pipes.
-When running this script, if you confront an error, saying ``No input!'', you
should be able to fix it by giving a larger number to this option (the default
value is 10000000 micro-seconds or 10 seconds).
-@end table
+Below you can see the printed/output calculations of CosmicCalculator that
are related to spectral lines.
+Note that @option{--obsline} is an input parameter, so it is discussed (with
the full list of known lines) in @ref{CosmicCalculator input options}.
+@table @option
+@item --listlines
+List the pre-defined rest frame spectral line wavelengths and their names on
standard output, then abort CosmicCalculator.
+The units of the displayed wavelengths for each line can be determined with
@option{--lineunit} (see below).
+When this option is given, other operations on the command-line will be
ignored.
+This is convenient when you forget the specific name of the spectral line used
within Gnuastro, or when you forget the exact wavelength of a certain line.
+These names can be used with the options that deal with spectral lines, for
example, @option{--obsline} and @option{--lineatz} (@ref{CosmicCalculator basic
cosmology calculations}).
+The format of the output list is a two-column table, with Gnuastro's text
table format (see @ref{Gnuastro text table format}).
+Therefore, if you are only looking for lines in a specific range, you can pipe
the output into Gnuastro's table program and use its @option{--range} option on
the @code{wavelength} (first) column.
+For example, if you only want to see the lines between 4000 and 6000
Angstroms, you can run this command:
+@example
+$ astcosmiccal --listlines \
+ | asttable --range=wavelength,4000,6000
+@end example
+@noindent
+And if you want to use the list later and have it as a table in a file, you
can easily add the @option{--output} (or @option{-o}) option to the
@command{asttable} command, and specify the filename, for example,
@option{--output=lines.fits} or @option{--output=lines.txt}.
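+
+For example:
+
+@example
+$ astcosmiccal --listlines \
+               | asttable --range=wavelength,4000,6000 \
+                          --output=lines.fits
+@end example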
+@item --listlinesatz
+Similar to @option{--listlines} (above), but the printed wavelength is not in
the rest frame, but redshifted to the given redshift.
+Recall that the redshift can be specified by @option{--redshift} directly or
by @option{--obsline}, see @ref{CosmicCalculator input options}.
+For an example usage of this option, see @ref{Viewing spectra and redshifted
lines}.
+@item -i STR/FLT
+@itemx --lineatz=STR/FLT
+The wavelength of the specified line at the redshift given to CosmicCalculator.
+The line can be specified either by its name or directly as a number (its
wavelength).
+The units of the displayed wavelengths for each line can be determined with
@option{--lineunit} (see below).
+To get the list of pre-defined names for the lines and their wavelength, you
can use the @option{--listlines} option, see @ref{CosmicCalculator input
options}.
+In the former case (when a name is given), the returned number is in units of
Angstroms.
+In the latter (when a number is given), the returned value is the same units
of the input number (assuming it is a wavelength).
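+
+For example, both commands below should print the same wavelength (the
+observed @mymath{Ly\alpha} wavelength at redshift 2): the first specifies the
+line by its pre-defined name (so the output is in Angstroms), the second by
+its rest-frame wavelength in Angstroms (so the output is in the input's
+units):
+
+@example
+$ astcosmiccal -z2 --lineatz=Ly-alpha
+$ astcosmiccal -z2 --lineatz=1215.64
+@end example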
+@item --lineunit=STR
+The units to display line wavelengths above.
+It can take the following four values.
+If you need any other unit, please contact us at @code{bug-gnuastro@@gnu.org}.
+@table @code
+@item m
+Meter.
+@item micro-m
+Micrometer or @mymath{10^{-6}m}.
+@item nano-m
+Nanometer, or @mymath{10^{-9}m}.
+@item angstrom
+Angstrom or @mymath{10^{-10}m}; the default unit when this option is not
called.
+@end table
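+
+For example, to print the full list of pre-defined lines in micrometers
+instead of the default Angstroms:
+
+@example
+$ astcosmiccal --listlines --lineunit=micro-m
+@end example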
+@end table
-@node Generate radial profile, SAO DS9 region files from table, Sort FITS
files by night, Installed scripts
-@section Generate radial profile
-@cindex Radial profile
-@cindex Profile, radial
-The 1 dimensional radial profile of an object is an important parameter in
many aspects of astronomical image processing.
-For example, you want to study how the light of a galaxy is distributed as a
function of the radial distance from the center.
-In other cases, the radial profile of a star can show the PSF (see @ref{PSF}).
-Gnuastro's @file{astscript-radial-profile} script is created to obtain such
radial profiles for one object within an image.
-This script uses @ref{MakeProfiles} to generate elliptical apertures with the
values equal to the distance from the center of the object and
@ref{MakeCatalog} for measuring the values over the apertures.
-@menu
-* Invoking astscript-radial-profile:: How to call astscript-radial-profile
-@end menu
+@node Installed scripts, Makefile extensions, High-level calculations, Top
+@chapter Installed scripts
-@node Invoking astscript-radial-profile, , Generate radial profile, Generate
radial profile
-@subsection Invoking astscript-radial-profile
+Gnuastro's programs (introduced in previous chapters) are designed to be
highly modular and thus contain lower-level operations on the data.
+However, many higher-level operations are also shared between different
contexts.
+For example, a sequence of calls to multiple Gnuastro programs, or a special
way of running a program and treating the output.
+To facilitate such higher-level data analysis, Gnuastro also installs some
scripts on your system with the (@code{astscript-}) prefix (in contrast to the
other programs that only have the @code{ast} prefix).
-This installed script will measure the radial profile of an object within an
image.
-For more on installed scripts please see (see @ref{Installed scripts}).
-This script can be used with the following general template:
+@cindex GNU Bash
+@cindex Portable shell
+@cindex Shell, portable
+Like all of Gnuastro's source code, these scripts are also heavily commented.
+They are written as portable shell scripts (for command-line environments),
which do not need compilation.
+Therefore, if you open the installed scripts in a text editor, you can
actually read them@footnote{Gnuastro's installed programs (those only starting
with @code{ast}) are not human-readable.
+They are written in C and need to be compiled before execution.
+Compilation optimizes the steps into the low-level hardware CPU
instructions/language to improve efficiency.
+Because compiled programs do not need an interpreter like Bash on every run,
they are much faster and more independent than scripts.
+To read the source code of the programs, look into the @file{bin/progname}
directory of Gnuastro's source (@ref{Downloading the source}).
+If you would like to read more about why C was chosen for the programs, please
see @ref{Why C}.}.
+For example, with this command (just replace @code{nano} with your favorite
text editor, like @command{emacs} or @command{vim}):
@example
-$ astscript-radial-profile [OPTION...] FITS-file
+$ nano $(which astscript-NAME)
@end example
-@noindent
-Examples:
+Shell scripting is the same language that you use when typing on the
command-line.
+Therefore shell scripting is much more widely known and used compared to C
(the language of other Gnuastro programs).
+Because Gnuastro's installed scripts do higher-level operations, customizing
these scripts for a special project will be more common than customizing the
programs.
-@example
-## Generate the radial profile with default options (assuming the
-## object is in the center of the image, and using the mean).
-$ astscript-radial-profile image.fits
+These scripts also accept options and are in many ways similar to the programs
(see @ref{Common options}) with some minor differences:
-## Generate the radial profile centered at x=44 and y=37 (in pixels),
-## up to a radial distance of 19 pixels, use the mean value.
-$ astscript-radial-profile image.fits --center=44,37 --rmax=19
+@itemize
+@item
+Currently they do not accept configuration files themselves.
+However, the configuration files of the Gnuastro programs they call are indeed
parsed and used by those programs.
-## Generate the radial profile centered at x=44 and y=37 (in pixels),
-## up to a radial distance of 100 pixels, compute sigma clipped
-## mean and standard deviation (sigclip-mean and sigclip-std) using
-## 5 sigma and 0.1 tolerance (default is 3 sigma and 0.2 tolerance).
-$ astscript-radial-profile image.fits --center=44,37 --rmax=100 \
- --sigmaclip=5,0.1 \
- --measure=sigclip-mean,sigclip-std
+As a result, they do not have the following options: @option{--checkconfig},
@option{--config}, @option{--lastconfig}, @option{--onlyversion},
@option{--printparams}, @option{--setdirconf} and @option{--setusrconf}.
-## Generate the radial profile centered at RA=20.53751695,
-## DEC=0.9454292263, up to a radial distance of 88 pixels,
-## axis ratio equal to 0.32, and position angle of 148 deg.
-## Name the output table as `radial-profile.fits'
-$ astscript-radial-profile image.fits --mode=wcs \
- --center=20.53751695,0.9454292263 \
- --rmax=88 --axis-ratio=0.32 \
- --position-angle=148 -oradial-profile.fits
+@item
+They do not directly allocate any memory, so there is no @option{--minmapsize}.
-## Generate the radial profile centered at RA=40.062675270971,
-## DEC=-8.1511992735126, up to a radial distance of 20 pixels,
-## and calculate the SNR using the INPUT-NO-SKY and SKY-STD
-## extensions of the NoiseChisel output file.
-$ astscript-radial-profile image_detected.fits -hINPUT-NO-SKY \
- --mode=wcs --measure=sn \
- --center=40.062675270971,-8.1511992735126 \
- --rmax=20 --stdhdu=SKY_STD
+@item
+They do not have an independent @option{--usage} option: when called with
@option{--usage}, they just recommend running @option{--help}.
-## Generate the radial profile centered at RA=40.062675270971,
-## DEC=-8.1511992735126, up to a radial distance of 20 pixels,
-## and compute the SNR with a fixed value for std, std=10.
-$ astscript-radial-profile image.fits -h1 --mode=wcs --rmax=20 \
- --center=40.062675270971,-8.1511992735126 \
- --measure=sn --instd=10
+@item
+The output of @option{--help} is not configurable like the programs (see
@ref{--help}).
-## Generate the radial profile centered at X=1201, Y=1201 pixels, up
-## to a radial distance of 20 pixels and compute the median and the
-## SNR using the first extension of sky-std.fits as the dataset for std
-## values.
-$ astscript-radial-profile image.fits -h1 --mode=img --rmax=20 \
- --center=1201,1201 --measure=median,sn \
- --instd=sky-std.fits
-@end example
+@item
+@cindex GNU AWK
+@cindex GNU SED
+The scripts will commonly use your installed shell and other basic
command-line tools (for example, AWK or SED).
+Different systems have different versions and implementations of these basic
tools (for example, GNU/Linux systems use GNU Bash, GNU AWK and GNU SED, which
are far more advanced and up to date than the minimalist AWK and SED of most
other systems).
+Therefore, unexpected errors in these tools might come up when you run these
scripts on non-GNU/Linux operating systems.
+If you do confront such strange errors, please submit a bug report so we fix
it as soon as possible (see @ref{Report a bug}).
-This installed script will read a FITS image and will use it as the basis for
constructing the radial profile.
-The output radial profile is a table (FITS or plain-text) containing the
radial distance from the center in the first row and the specified measurements
in the other columns (mean, median, sigclip-mean, sigclip-median, etc.).
+@end itemize
-To measure the radial profile, this script needs to generate temporary files.
-All these temporary files will be created within the directory given to the
@option{--tmpdir} option.
-When @option{--tmpdir} is not called, a temporary directory (with a name based
on the inputs) will be created in the running directory.
-If the directory does not exist at run-time, this script will create it.
-After the output is created, this script will delete the directory by default,
unless you call the @option{--keeptmp} option.
+@menu
+* Sort FITS files by night:: Sort many files by date.
+* Generate radial profile:: Radial profile of an object in an image.
+* SAO DS9 region files from table:: Create ds9 region file from a table.
+* Viewing FITS file contents with DS9 or TOPCAT:: Open DS9 (images/cubes) or
TOPCAT (tables).
+* Zero point estimation:: Zero point of an image from reference catalog
or image(s).
+* Dithering pattern simulation:: Simulate a stack with a certain dithering
pattern.
+* PSF construction and subtraction:: Set of scripts to create extended PSF of
an image.
+@end menu
-With the default options, the script will generate a circular radial profile
using the mean value and centered at the center of the image.
-In order to have more flexibility, several options are available to configure
for the desired radial profile.
-In this sense, you can change the center position, the maximum radius, the
axis ratio and the position angle (elliptical apertures are considered), the
operator for obtaining the profiles, and others (described below).
+@node Sort FITS files by night, Generate radial profile, Installed scripts,
Installed scripts
+@section Sort FITS files by night
-@cartouche
-@noindent
-@strong{Debug your profile:} to debug your results, especially close to the
center of your object, you can see the radial distance associated to every
pixel in your input.
-To do this, use @option{--keeptmp} to keep the temporary files, and compare
@file{crop.fits} (crop of your input image centered on your desired coordinate)
with @file{apertures.fits} (radial distance of each pixel).
-@end cartouche
+@cindex Calendar
+FITS images usually contain (several) keywords for preserving important dates.
+In particular, for lower-level data, this is usually the observation date and
time (for example, stored in the @code{DATE-OBS} keyword value).
+When analyzing observed datasets, many calibration steps (like the dark, bias
or flat-field), are commonly calculated on a per-observing-night basis.
-@cartouche
-@noindent
-@strong{Finding properties of your elliptical target: } you want to measure
the radial profile of a galaxy, but do not know its exact location, position
angle or axis ratio.
-To obtain these values, you can use @ref{NoiseChisel} to detect signal in the
image, feed it to @ref{Segment} to do basic segmentation, then use
@ref{MakeCatalog} to measure the center (@option{--x} and @option{--y} in
MakeCatalog), axis ratio (@option{--axis-ratio}) and position angle
(@option{--position-angle}).
-@end cartouche
+However, the FITS standard's date format (@code{YYYY-MM-DDThh:mm:ss.ddd}) is
based on the western (Gregorian) calendar.
+Dates that are stored in this format are complicated for automatic processing:
a night starts in the final hours of one calendar day, and extends to the early
hours of the next calendar day.
+As a result, to identify datasets from one night, we commonly need to search
for two dates.
+However, calendar peculiarities can make this identification very difficult.
+For example, when an observation is done on the night separating two months
(like the night starting on March 31st and going into April 1st), or two years
(like the night starting on December 31st 2018 and going into January 1st,
2019).
+To account for such situations, it is necessary to keep track of how many days
are in a month, and leap years, etc.
+
+@cindex Unix epoch time
+@cindex Time, Unix epoch
+@cindex Epoch, Unix time
+Gnuastro's @file{astscript-sort-by-night} script is created to help in such
important scenarios.
+It uses @ref{Fits} to convert the FITS date format into the Unix epoch time
(number of seconds since 00:00:00 of January 1st, 1970), using the
@option{--datetosec} option.
+The Unix epoch time is a single number (integer, if not given in sub-second
precision), enabling easy comparison and sorting of dates after January 1st,
1970.
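+
+For example, a minimal sketch of this conversion on a single file (assuming
+@file{image.fits} has the @code{DATE-OBS} keyword in HDU 1) would be:
+
+@example
+$ astfits image.fits -h1 --datetosec=DATE-OBS
+@end example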
+
+You can use this script as a basis for making a more customized sorting
script.
+Here are some examples:
-@cartouche
-@noindent
-@strong{Masking other sources:} The image of an astronomical object will
usually have many other sources with your main target.
-A crude solution is to use sigma-clipped measurements for the profile.
-However, sigma-clipped measurements can easily be biased when the number of
sources at each radial distance increases at larger distances.
-Therefore a robust solution is to mask all other detections within the image.
-You can use @ref{NoiseChisel} and @ref{Segment} to detect and segment the
sources, then set all pixels that do not belong to your target to blank using
@ref{Arithmetic} (in particular, its @code{where} operator).
-@end cartouche
+@itemize
+@item
+If you need to copy the files, but only need a single extension (not the whole
file), you can add a step just before the making of the symbolic links, or
copies, and change it to only copy a certain extension of the FITS file using
the Fits program's @option{--copy} option, see @ref{HDU information and
manipulation}.
-@table @option
-@item -h STR
-@itemx --hdu=STR
-The HDU/extension of the input image to use.
+@item
+If you need to classify the files with finer detail (for example, the purpose
of the dataset), you can add a step just before the making of the symbolic
links, or copies, to specify a file-name prefix based on other certain keyword
values in the files.
+For example, when the FITS files have a keyword to specify if the dataset is a
science, bias, or flat-field image.
+You can read it and add a @code{sci-}, @code{bias-}, or @code{flat-} to the
created file name (after the @option{--prefix}) automatically.
-@item -o STR
-@itemx --output=STR
-Filename of measured radial profile.
-It can be either a FITS table, or plain-text table (determined from your given
file name suffix).
+For example, let's assume the observing mode is stored in the hypothetical
@code{MODE} keyword, which can have three values of @code{BIAS-IMAGE},
@code{SCIENCE-IMAGE} and @code{FLAT-EXP}.
+With the step below, you can generate a mode-prefix, and add it to the
generated link/copy names (just correct the filename and extension of the first
line to the script's variables):
-@item -c FLT[,FLT[,...]]
-@itemx --center=FLT[,FLT[,...]]
-The central position of the radial profile.
-This option is used for placing the center of the profiles.
-This parameter is used in @ref{Crop} to center and crop the region.
-The positions along each dimension must be separated by a comma (@key{,}) and
fractions are also acceptable.
-The number of values given to this option must be the same as the dimensions
of the input dataset.
-The units of the coordinates are read based on the value to the
@option{--mode} option, see below.
+@example
+modepref=$(astfits infile.fits -h1 \
+ | sed -e"s/'/ /g" \
+ | awk '$1=="MODE"@{ \
+ if($3=="BIAS-IMAGE") print "bias-"; \
+ else if($3=="SCIENCE-IMAGE") print "sci-"; \
+ else if($3=="FLAT-EXP") print "flat-"; \
+ else @{print $3, "NOT recognized"; exit 1@}@}')
+@end example
-@item -O STR
-@itemx --mode=STR
-Interpret the center position of the object (values given to
@option{--center}) in image or WCS coordinates.
-This option thus accepts only two values: @option{img} or @option{wcs}.
-By default, it is @option{--mode=img}.
+@cindex GNU AWK
+@cindex GNU Sed
+Here is a description of how it works.
+We first use @command{astfits} to print all the keywords in extension @code{1}
of @file{infile.fits}.
+In the FITS standard, string values (that we are assuming here) are placed in
single quotes (@key{'}), which get in the way in this context.
+Therefore, we pipe the output of @command{astfits} into @command{sed} to
remove all such quotes (substituting them with a blank space).
+The result is then piped to AWK to give us the final mode-prefix: with
@code{$1=="MODE"}, we ask AWK to only consider the line where the first column
is @code{MODE}.
+There is an equal sign between the key name and value, so the value is the
third column (@code{$3} in AWK).
+We thus use a simple @code{if-else} structure to look into this value and
print our custom prefix based on it.
+The output of AWK is then stored in the @code{modepref} shell variable, which
you can add to the link/copy name (see the sketch after this list).
-@item -R FLT
-@itemx --rmax=FLT
-Maximum radius for the radial profile (in pixels).
-By default, the radial profile will be computed up to a radial distance equal
to the maximum radius that fits into the image (assuming circular shape).
+With the solution above, the increment of the file counter for each night will
be independent of the mode.
+If you want the counter to be mode-dependent, you can add a different counter
for each mode and use that counter instead of the generic counter for each
night (based on the value of @code{modepref}).
+But we will leave the implementation of this step to you as an exercise.
-@item -P INT
-@itemx --precision=INT
-The precision (number of digits after the decimal point) in resolving the
radius.
-The default value is @option{--precision=0} (or @option{-P0}), and the value
cannot be larger than @option{6}.
-A higher precision is primarily useful when the very central few pixels are
important for you.
-A larger precision will over-resolve larger radial regions, causing scatter to
significantly affect the measurements.
+@end itemize
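+
+For example, here is a minimal sketch of how the @code{modepref} variable
above could be spliced into a created link's name (@code{$infile},
@code{$night} and @code{$count} are hypothetical place-holders for the
script's own internal variables):
+
+@example
+ln -s $infile $@{modepref@}n$night-$count.fits
+@end example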
-For example, in the command below, we will generate the radial profile of an
imaginary source (at RA,DEC of 1.23,4.567) and check the output without setting
a precision:
+@menu
+* Invoking astscript-sort-by-night:: Inputs and outputs to this script.
+@end menu
-@example
-$ astscript-radial-profile image.fits --center=1.23,4.567 \
- --mode=wcs --measure=mean,area --rmax=10 \
- --output=radial.fits --quiet
-$ asttable radial.fits --head=10 -ffixed -p4
-0.0000 0.0139 1
-1.0000 0.0048 8
-2.0000 0.0023 16
-3.0000 0.0015 20
-4.0000 0.0011 24
-5.0000 0.0008 40
-6.0000 0.0006 36
-7.0000 0.0005 48
-8.0000 0.0004 56
-9.0000 0.0003 56
-@end example
+@node Invoking astscript-sort-by-night, , Sort FITS files by night, Sort FITS
files by night
+@subsection Invoking astscript-sort-by-night
-Let's repeat the command above, but use a precision of 3 to resolve more finer
details of the radial profile, while only printing the top 10 rows of the
profile:
+This installed script will read a FITS date formatted value from the given
keyword, and classify the input FITS files into individual nights.
+For more on installed scripts, see @ref{Installed scripts}.
+This script can be used with the following general template:
@example
-$ astscript-radial-profile image.fits --center=1.23,4.567 \
- --mode=wcs --measure=mean,area --rmax=10 \
- --precision=3 --output=radial.fits --quiet
-$ asttable radial.fits --head=10 -ffixed -p4
-0.0000 0.0139 1
-1.0000 0.0056 4
-1.4140 0.0040 4
-2.0000 0.0027 4
-2.2360 0.0024 8
-2.8280 0.0018 4
-3.0000 0.0017 4
-3.1620 0.0016 8
-3.6050 0.0013 8
-4.0000 0.0011 4
+$ astscript-sort-by-night [OPTION...] FITS-files
@end example
-Do you see how many more radii have been added?
-Between 1.0 and 2.0, we now have one extra radius, between 2.0 to 3.0, we have
two new radii and so on.
-If you go to larger and larger radii, you will notice that they get resolved
into many sub-components and the number of pixels used in each measurement will
not be significant (you can already see that in the comparison above).
-This has two problems:
-1. statistically, the scatter in larger radii (where the signal-to-noise ratio
is usually low will make it hard to interpret the profile.
-2. technically, the output table will have many more rows!
-
-@cartouche
@noindent
-@strong{Use higher precision only for small radii:} If you want to look at the
whole profile (or the outer parts!), don't set the precision, the default mode
is usually more than enough!
-But when you are targeting the very central few pixels (usually less than a
pixel radius of 5), use a higher precision.
-@end cartouche
-
-@item -v INT
-@itemx --oversample=INT
-Oversample the input dataset to the fraction given to this option.
-Therefore if you set @option{--rmax=20} for example, and
@option{--oversample=5}, your output will have 100 rows (without
@option{--oversample} it will only have 20 rows).
-Unless the object is heavily undersampled (the pixels are larger than the
actual object), this method provides a much more accurate result and there are
sufficient number of pixels to get the profile accurately.
+One line examples:
-Due to the discrete nature of pixels, if you use this option to oversample
your profile, set @option{--precision=0}.
-Otherwise, your profile will become step-like (with several radii having a
single value).
+@example
+## Use the DATE-OBS keyword
+$ astscript-sort-by-night --key=DATE-OBS /path/to/data/*.fits
-@item -u INT
-@itemx --undersample=INT
-Undersample the input dataset by the number given to this option.
-This option is for considering larger apertures than the original pixel size
(aperture size is equal to 1 pixel).
-For example, if a radial profile computed by default has 100 different radii
(apertures of 1 pixel width), by considering @option{--undersample=2} the
radial profile will be computed over apertures of 2 pixels, so the final radial
profile will have 50 different radii.
-This option is good to measure over a larger number of pixels to improve the
measurement.
+## Make links to the input files with the `img-' prefix
+$ astscript-sort-by-night --link --prefix=img- /path/to/data/*.fits
+@end example
-@item -Q FLT
-@itemx --axis-ratio=FLT
-The axis ratio of the apertures (minor axis divided by the major axis in a 2D
ellipse).
-By default (when this option is not given), the radial profile will be
circular (axis ratio of 1).
-This parameter is used as the option @option{--qcol} in the generation of the
apertures with @command{astmkprof}.
+This script will look into an HDU/extension (@option{--hdu}) for a keyword
(@option{--key}) in the given FITS files and interpret the value as a date.
+The inputs will be separated by ``night''s (11:00 a.m. to the next day's
10:59:59 a.m., spanning two calendar days; the exact hour can be set with
@option{--hour}).
-@item -p FLT
-@itemx --position-angle=FLT
-The position angle (in degrees) of the profiles relative to the first FITS
axis (horizontal when viewed in SAO DS9).
-By default, it is @option{--position-angle=0}, which means that the semi-major
axis of the profiles will be parallel to the first FITS axis.
+The default output is a list of all the input files along with the following
two columns: night number and file number in that night (sorted by time).
+With @option{--link}, a symbolic link will be made (one for each input) whose
name contains the night number and the file's number within that night (sorted
by time); see the description of @option{--link} for more.
+When @option{--copy} is used, a copy of each input will be made instead of a
symbolic link.
-@item -a FLT,FLT
-@itemx --azimuth=FLT,FLT
-@cindex Wedge (radial profile)
-@cindex Azimuthal range (radial profile)
-Limit the profile to the given azimuthal angle range (two numbers given to
this option, in degrees, from 0 to 360) from the major axis (defined by
@option{--position-angle}).
-The radial profile will therefore be created on a wedge-like shape, not the
full circle/ellipse.
-The pixel containing the center of the profile will always be included in the
profile (because it contains all azimuthal angles!).
+Below you can see one example where all the @file{target-*.fits} files in the
@file{data} directory should be separated by observing night according to the
@code{DATE-OBS} keyword value in their second extension (number @code{1},
recall that HDU counting starts from 0).
+You can see the output after the @code{ls} command.
-If the first angle is @emph{smaller} than the second (for example,
@option{--azimuth=10,80}), the region between, or @emph{inside}, the two angles
will be used.
-Otherwise (for example, @option{--azimuth=80,10}), the region @emph{outside}
the two angles will be used.
-The latter case can be useful when you want to ignore part of the 2D shape
(for example, due to a bright star that can be contaminating it).
+@example
+$ astscript-sort-by-night -pimg- -h1 -kDATE-OBS data/target-*.fits
+$ ls
+img-n1-1.fits img-n1-2.fits img-n2-1.fits ...
+@end example
-You can visually see the shape of the region used by running this script with
@option{--keeptmp} and viewing the @file{values.fits} and @file{apertures.fits}
files of the temporary directory with a FITS image viewer like @ref{SAO DS9}.
-You can use @ref{Viewing FITS file contents with DS9 or TOPCAT} to open them
together in one instance of DS9, with both frames matched and locked (for easy
comparison in case you want to zoom-in or out).
-For example, see the commands below (based on your target object, just change
the image name, center, position angle, etc.):
+The outputs can be placed in a different (already existing) directory by
including that directory's name in the @option{--prefix} value; for example,
@option{--prefix=sorted/img-} will put them all under the @file{sorted}
directory.
-@example
-## Generate the radial profile
-$ astscript-radial-profile image.fits --center=1.234,6.789 \
- --mode=wcs --rmax=50 --position-angle=20 \
- --axis-ratio=0.8 --azimuth=95,150 --keeptmp \
- --tmpdir=radial-tmp
+This script can be configured like all Gnuastro's programs (through
command-line options, see @ref{Common options}), with some minor differences
that are described in @ref{Installed scripts}.
+The particular options to this script are listed below:
-## Visually check the values and apertures used.
-$ astscript-fits-view radial-tmp/values.fits \
- radial-tmp/apertures.fits
-@end example
+@table @option
+@item -h STR
+@itemx --hdu=STR
+The HDU/extension to use in all the given FITS files.
+All of the given FITS files must have this extension.
+@item -k STR
+@itemx --key=STR
+The keyword name that contains the FITS date format to classify/sort by.
-@item -m STR
-@itemx --measure=STR
-The operator for measuring the values over each radial distance.
-The values given to this option will be directly passed to @ref{MakeCatalog}.
-As a consequence, all MakeCatalog measurements like the magnitude, magnitude
error, median, mean, signal-to-noise ratio (S/N), std, surface brightness,
sigclip-mean, and sigclip-number can be used here.
-For a full list of MakeCatalog's measurements, please run
@command{astmkcatalog --help} or see @ref{MakeCatalog measurements}.
-Multiple values can be given to this option, each separated by a comma.
-This option can also be called multiple times.
+@item -H FLT
+@itemx --hour=FLT
+The hour that defines the next ``night''.
+By default, all times before 11:00 a.m. are considered to belong to the
previous calendar night.
+If a sub-hour value is necessary, it should be given in units of hours, for
example, @option{--hour=9.5} corresponds to 9:30a.m.
@cartouche
@noindent
-@strong{Masking background/foreground objects:} For crude rejection of
outliers, you can use sigma-clipping using MakeCatalog measurements like
@option{--sigclip-mean} or @option{--sigclip-mean-sb} (see @ref{MakeCatalog
measurements}).
-To properly mask the effect of background/foreground objects from your target
object's radial profile, you can use @command{astscript-psf-stamp} script, see
@ref{Invoking astscript-psf-stamp}, and feed it the output of @ref{Segment}.
-This script will mask unwanted objects from the image that is later used to
measure the radial profile.
-@end cartouche
-
-Some measurements by MakeCatalog require a per-pixel sky standard deviation
(for example, magnitude error or S/N).
-Therefore when asking for such measurements, use the @option{--instd} option
(described below) to specify the per-pixel sky standard deviation over each
pixel.
-For other measurements like the magnitude or surface brightness, MakeCatalog
will need a Zero point, which you can set with the @option{--zeropoint} option.
+@cindex Time zone
+@cindex UTC (Universal time coordinate)
+@cindex Universal time coordinate (UTC)
+@strong{Dealing with time zones:}
+The time that is recorded in @option{--key} may be in UTC (Universal Time
Coordinate).
+However, the organization of the images taken during the night depends on the
local time.
+It is possible to take this into account by setting the @option{--hour}
option to the desired local hour, converted to UTC.
-For example, by setting @option{--measure=mean,sigclip-mean --measure=median},
the mean, sigma-clipped mean and median values will be computed.
-The output radial profile will have 4 columns in this order: radial distance,
mean, sigma-clipped and median.
-By default (when this option is not given), the mean of all pixels at each
radial position will be computed.
+For example, consider a set of images taken in Auckland (New Zealand, UTC+12)
during different nights.
+If you want to classify these images by night, you have to know at which time
(in UTC) the Sun rises (or any other separator/definition of a different
night).
+For example, if your observing night finishes before 9:00 a.m. in Auckland,
you can use @option{--hour=21}, because the local time of 9:00 a.m. in
Auckland corresponds to 21:00 UTC (of the previous calendar day).
+@end cartouche
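+
+For the Auckland example above, the call would look something like this (the
file names are hypothetical):
+
+@example
+$ astscript-sort-by-night --key=DATE-OBS --hour=21 \
+           /path/to/data/*.fits
+@end example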
-@item -s FLT,FLT
-@itemx --sigmaclip=FLT,FLT
-Sigma clipping parameters: only relevant if sigma-clipping operators are
requested by @option{--measure}.
-For more on sigma-clipping, see @ref{Sigma clipping}.
-If given, the value to this option is directly passed to the
@option{--sigmaclip} option of @ref{MakeCatalog}, see @ref{MakeCatalog inputs
and basic settings}.
-By default (when this option is not given), the default values within
MakeCatalog will be used.
-To see the default value of this option in MakeCatalog, you can run this
command:
+@item -l
+@itemx --link
+Create a symbolic link for each input FITS file.
+This option cannot be used with @option{--copy}.
+The link will have a standard name in the following format (variable parts are
written in @code{CAPITAL} letters and described after it):
@example
-$ astmkcatalog -P | grep " sigmaclip "
+PnN-I.fits
@end example
-@item -z FLT
-@itemx --zeropoint=FLT
-The Zero point of the input dataset.
-This is necessary when you request measurements like magnitude, or surface
brightness.
-
-@item -Z
-@itemx --zeroisnotblank
-Account for zero-valued pixels in the profile.
-By default, such pixels are not considered (when this script crops the
necessary region of the image before generating the profile).
-The long format of this option is identical to a similarly named option in
Crop (see @ref{Invoking astcrop}).
-When this option is called, it is passed directly to Crop, therefore the
zero-valued pixels are not considered as blank and used in the profile creation.
+@table @code
+@item P
+This is the value given to @option{--prefix}.
+By default, its value is @code{./} (to store the links in the directory this
script was run in).
+See the description of @option{--prefix} for more.
+@item N
+This is the night-counter, starting from 1.
+@code{N} is just incremented by 1 for each new night, no matter how many
(dataset-less) nights there are between two subsequent observing nights (it is
just an identifier for each night, which you can easily map to calendar
nights).
+@item I
+File counter in that night, sorted by time.
+@end table
-@item -i FLT/STR
-@itemx --instd=FLT/STR
-Sky standard deviation as a single number (FLT) or as the filename (STR)
containing the image with the std value for each pixel (the HDU within the file
should be given to the @option{--stdhdu} option mentioned below).
-This is only necessary when the requested measurement (value given to
@option{--measure}) by MakeCatalog needs the Standard deviation (for example,
the signal-to-noise ratio or magnitude error).
-If your measurements do not require a standard deviation, it is best to ignore
this option (because it will slow down the script).
+@item -c
+@itemx --copy
+Make a copy of each input FITS file (instead of a symbolic link) with the
standard naming convention described in @option{--link}.
+This option cannot be used with @option{--link}.
-@item -d INT/STR
-@itemx --stdhdu=INT/STR
-HDU/extension of the sky standard deviation image specified with
@option{--instd}.
+@item -p STR
+@itemx --prefix=STR
+Prefix to add before the night-identifier of each newly created link or
copy.
+This option is thus only relevant with the @option{--copy} or @option{--link}
options.
+See the description of @option{--link} for how it is used.
+For example, with @option{--prefix=img-}, all the created file names in the
current directory will start with @code{img-}, making outputs like
@file{img-n1-1.fits} or @file{img-n3-42.fits}.
-@item -t STR
-@itemx --tmpdir=STR
-Several intermediate files are necessary to obtain the radial profile.
-All of these temporal files are saved into a temporal directory.
-With this option, you can directly specify this directory.
-By default (when this option is not called), it will be built in the running
directory and given an input-based name.
-If the directory does not exist at run-time, this script will create it.
-Once the radial profile has been obtained, this directory is removed.
-You can disable the deletion of the temporary directory with the
@option{--keeptmp} option.
+@option{--prefix} can also be used to store the links/copies in another
(already existing) directory.
+For example, @option{--prefix=/path/to/processing/img-} will put all the
links/copies in the @file{/path/to/processing} directory, and the files (in
that directory) will all start with @file{img-}.
-@item -k
-@itemx --keeptmp
-Do Not delete the temporary directory (see description of @option{--tmpdir}
above).
-This option is useful for debugging.
-For example, to check that the profiles generated for obtaining the radial
profile have the desired center, shape and orientation.
+@item --stdintimeout=INT
+Number of micro-seconds to wait for standard input within this script.
+This does not correspond to general inputs into the script; inputs to the
script should always be given as files.
+However, within the script, pipes are often used to pass the output of one
program to another.
+The value given to this option will be passed to those internal pipes.
+When running this script, if you encounter an error saying ``No input!'', you
should be able to fix it by giving a larger number to this option (the default
value is 10000000 microseconds, or 10 seconds).
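+
+For example, a call like this (the file names are hypothetical) would double
the default timeout:
+
+@example
+$ astscript-sort-by-night --stdintimeout=20000000 \
+           --key=DATE-OBS /path/to/data/*.fits
+@end example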
@end table
@@ -31111,869 +31301,702 @@ For example, to check that the profiles generated
for obtaining the radial profi
-@node SAO DS9 region files from table, Viewing FITS file contents with DS9 or
TOPCAT, Generate radial profile, Installed scripts
-@section SAO DS9 region files from table
-
-Once your desired catalog (containing the positions of some objects) is
created (for example, with @ref{MakeCatalog}, @ref{Match}, or @ref{Table}) it
often happens that you want to see your selected objects on an image for a
feeling of the spatial properties of your objects.
-For example, you want to see their positions relative to each other.
+@node Generate radial profile, SAO DS9 region files from table, Sort FITS
files by night, Installed scripts
+@section Generate radial profile
-In this section we describe a simple installed script that is provided within
Gnuastro for converting your given columns to an SAO DS9 region file to help in
this process.
-SAO DS9@footnote{@url{http://ds9.si.edu}} is one of the most common FITS image
visualization tools in astronomy and is free software.
+@cindex Radial profile
+@cindex Profile, radial
+The one-dimensional radial profile of an object is an important parameter in
many aspects of astronomical image processing.
+For example, you may want to study how the light of a galaxy is distributed as
a function of the radial distance from the center.
+In other cases, the radial profile of a star can show the PSF (see @ref{PSF}).
+Gnuastro's @file{astscript-radial-profile} script is created to obtain such
radial profiles for one object within an image.
+This script uses @ref{MakeProfiles} to generate elliptical apertures with
pixel values equal to the distance from the center of the object, and
@ref{MakeCatalog} to measure the values over those apertures.
@menu
-* Invoking astscript-ds9-region:: How to call astscript-ds9-region
+* Invoking astscript-radial-profile:: How to call astscript-radial-profile
@end menu
-@node Invoking astscript-ds9-region, , SAO DS9 region files from table, SAO
DS9 region files from table
-@subsection Invoking astscript-ds9-region
+@node Invoking astscript-radial-profile, , Generate radial profile, Generate
radial profile
+@subsection Invoking astscript-radial-profile
-This installed script will read two positional columns within an input table
and generate an SAO DS9 region file to visualize the position of the given
objects over an image.
+This installed script will measure the radial profile of an object within an
image.
For more on installed scripts, see @ref{Installed scripts}.
This script can be used with the following general template:
@example
-## Use the RA and DEC columns of 'table.fits' for the region file.
-$ astscript-ds9-region table.fits --column=RA,DEC \
- --output=ds9.reg
-
-## Select objects with a magnitude between 18 to 20, and generate the
-## region file directly (through a pipe), each region with radius of
-## 0.5 arcseconds.
-$ asttable table.fits --range=MAG,18:20 --column=RA,DEC \
- | astscript-ds9-region --column=1,2 --radius=0.5
-
-## With the first command, select objects with a magnitude of 25 to 26
-## as red regions in 'bright.reg'. With the second command, select
-## objects with a magnitude between 28 to 29 as a green region and
-## show both.
-$ asttable cat.fits --range=MAG_F160W,25:26 -cRA,DEC \
- | astscript-ds9-region -c1,2 --color=red -obright.reg
-$ asttable cat.fits --range=MAG_F160W,28:29 -cRA,DEC \
- | astscript-ds9-region -c1,2 --color=green \
- --command="ds9 image.fits -regions bright.reg"
+$ astscript-radial-profile [OPTION...] FITS-file
@end example
-The input can either be passed as a named file, or from standard input (a
pipe).
-Only the @option{--column} option is mandatory (to specify the input table
columns): two columns from the input table must be specified, either by name
(recommended) or number.
-You can optionally also specify the region's radius, width and color of the
regions with the @option{--radius}, @option{--width} and @option{--color}
options, otherwise default values will be used for these (described under each
option).
-
-The created region file will be written into the file name given to
@option{--output}.
-When @option{--output} is not called, the default name of @file{ds9.reg} will
be used (in the running directory).
-If the file exists before calling this script, it will be overwritten, unless
you pass the @option{--dontdelete} option.
-Optionally you can also use the @option{--command} option to give the full
command that should be run to execute SAO DS9 (see example above and
description below).
-In this mode, the created region file will be deleted once DS9 is closed
(unless you pass the @option{--dontdelete} option).
-A full description of each option is given below.
-
-@table @option
-
-@item -h INT/STR
-@item --hdu INT/STR
-The HDU of the input table when a named FITS file is given as input.
-The HDU (or extension) can be either a name or number (counting from zero).
-For more on this option, see @ref{Input output options}.
-
-@item -c STR,STR
-@itemx --column=STR,STR
-Identifiers of the two positional columns to use in the DS9 region file from
the table.
-They can either be in WCS (RA and Dec) or image (pixel) coordinates.
-The mode can be specified with the @option{--mode} option, described below.
-
-@item -n STR
-@itemx --namecol=STR
-The column containing the name (or label) of each region.
-The type of the column (numeric or a character-based string) is irrelevant:
you can use both types of columns as a name or label for the region.
-This feature is useful when you need to recognize each region with a certain
ID or property (for example, magnitude or redshift).
-
-@item -m wcs|img
-@itemx --mode=wcs|org
-The coordinate system of the positional columns (can be either
@option{--mode=wcs} and @option{--mode=img}).
-In the WCS mode, the values within the columns are interpreted to be RA and
Dec.
-In the image mode, they are interpreted to be pixel X and Y positions.
-This option also affects the interpretation of the value given to
@option{--radius}.
-When this option is not explicitly given, the columns are assumed to be in WCS
mode.
-
-@item -C STR
-@itemx --color=STR
-The color to use for created regions.
-These will be directly interpreted by SAO DS9 when it wants to open the region
file so it must be recognizable by SAO DS9.
-As of SAO DS9 8.2, the recognized color names are @code{black}, @code{white},
@code{red}, @code{green}, @code{blue}, @code{cyan}, @code{magenta} and
@code{yellow}.
-The default color (when this option is not called) is @code{green}
+@noindent
+Examples:
-@item -w INT
-@itemx --width=INT
-The line width of the regions.
-These will be directly interpreted by SAO DS9 when it wants to open the region
file so it must be recognizable by SAO DS9.
-The default value is @code{1}.
+@example
+## Generate the radial profile with default options (assuming the
+## object is in the center of the image, and using the mean).
+$ astscript-radial-profile image.fits
-@item -r FLT
-@itemx --radius=FLT
-The radius of all the regions.
-In WCS mode, the radius is assumed to be in arc-seconds, in image mode, it is
in pixel units.
-If this option is not explicitly given, in WCS mode the default radius is 1
arc-seconds and in image mode it is 3 pixels.
+## Generate the radial profile centered at x=44 and y=37 (in pixels),
+## up to a radial distance of 19 pixels, use the mean value.
+$ astscript-radial-profile image.fits --center=44,37 --rmax=19
-@item --dontdelete
-If the output file name exists, abort the program and do not over-write the
contents of the file.
-This option is thus good if you want to avoid accidentally writing over an
important file.
-Also, do not delete the created region file when @option{--command} is given
(by default, when @option{--command} is given, the created region file will be
deleted after SAO DS9 closes).
+## Generate the radial profile centered at x=44 and y=37 (in pixels),
+## up to a radial distance of 100 pixels, compute sigma clipped
+## mean and standard deviation (sigclip-mean and sigclip-std) using
+## 5 sigma and 0.1 tolerance (default is 3 sigma and 0.2 tolerance).
+$ astscript-radial-profile image.fits --center=44,37 --rmax=100 \
+ --sigmaclip=5,0.1 \
+ --measure=sigclip-mean,sigclip-std
-@item -o STR
-@itemx --output=STR
-Write the created SAO DS9 region file into the name given to this option.
-If not explicitly given on the command-line, a default name of @file{ds9.reg}
will be used.
-If the file already exists, it will be over-written, you can avoid the
deletion (or over-writing) of an existing file with the @option{--dontdelete}.
+## Generate the radial profile centered at RA=20.53751695,
+## DEC=0.9454292263, up to a radial distance of 88 pixels,
+## axis ratio equal to 0.32, and position angle of 148 deg.
+## Name the output table as `radial-profile.fits'
+$ astscript-radial-profile image.fits --mode=wcs \
+ --center=20.53751695,0.9454292263 \
+ --rmax=88 --axis-ratio=0.32 \
+ --position-angle=148 -oradial-profile.fits
-@item --command="STR"
-After creating the region file, run the string given to this option as a
command-line command.
-The SAO DS9 region command will be appended to the end of the given command.
-Because the command will mostly likely contain white-space characters it is
recommended to put the given string in double quotations.
+## Generate the radial profile centered at RA=40.062675270971,
+## DEC=-8.1511992735126, up to a radial distance of 20 pixels,
+## and calculate the SNR using the INPUT-NO-SKY and SKY_STD
+## extensions of the NoiseChisel output file.
+$ astscript-radial-profile image_detected.fits -hINPUT-NO-SKY \
+ --mode=wcs --measure=sn \
+ --center=40.062675270971,-8.1511992735126 \
+ --rmax=20 --stdhdu=SKY_STD
-For example, let's assume @option{--command="ds9 image.fits -zscale"}.
-After making the region file (assuming it is called @file{ds9.reg}), the
following command will be executed:
+## Generate the radial profile centered at RA=40.062675270971,
+## DEC=-8.1511992735126, up to a radial distance of 20 pixels,
+## and compute the SNR with a fixed value for std, std=10.
+$ astscript-radial-profile image.fits -h1 --mode=wcs --rmax=20 \
+ --center=40.062675270971,-8.1511992735126 \
+ --measure=sn --instd=10
-@example
-ds9 image.fits -zscale -regions ds9.reg
+## Generate the radial profile centered at X=1201, Y=1201 pixels, up
+## to a radial distance of 20 pixels and compute the median and the
+## SNR using the first extension of sky-std.fits as the dataset for std
+## values.
+$ astscript-radial-profile image.fits -h1 --mode=img --rmax=20 \
+ --center=1201,1201 --measure=median,sn \
+ --instd=sky-std.fits
@end example
-You can customize all aspects of SAO DS9 with its command-line options,
therefore the value of this option can be as long and complicated as you like.
-For example, if you also want the image to fit into the window, this option
will be: @command{--command="ds9 image.fits -zscale -zoom to fit"}.
-You can see the SAO DS9 command-line descriptions by clicking on the ``Help''
menu and selecting ``Reference Manual''.
-In the opened window, click on ``Command Line Options''.
-@end table
-
-
-
-
-@node Viewing FITS file contents with DS9 or TOPCAT, Zero point estimation,
SAO DS9 region files from table, Installed scripts
-@section Viewing FITS file contents with DS9 or TOPCAT
+This installed script will read a FITS image and use it as the basis for
constructing the radial profile.
+The output radial profile is a table (FITS or plain-text) containing the
radial distance from the center in the first column and the specified
measurements in the other columns (mean, median, sigclip-mean, sigclip-median,
etc.).
-@cindex Multi-Extension FITS
-@cindex Opening multi-extension FITS
-The FITS definition allows for multiple extensions (or HDUs) inside one FITS
file.
-Each HDU can have a completely independent dataset inside of it.
-One HDU can be a table, another can be an image and another can be another
independent image.
-For example, each image HDU can be one CCD of a multi-CCD camera, or in
processed images one can be the deep science image and the next can be its
weight map, alternatively, one HDU can be an image, and another can be the
catalog/table of objects within it.
+To measure the radial profile, this script needs to generate temporary files.
+All these temporary files will be created within the directory given to the
@option{--tmpdir} option.
+When @option{--tmpdir} is not called, a temporary directory (with a name based
on the inputs) will be created in the running directory.
+If the directory does not exist at run-time, this script will create it.
+After the output is created, this script will delete the directory by default,
unless you call the @option{--keeptmp} option.
-The most common software for viewing FITS images is SAO DS9 (see @ref{SAO
DS9}) and for plotting tables, TOPCAT is the most commonly used tool in
astronomy (see @ref{TOPCAT}).
-After installing them (as described in the respective appendix linked in the
previous sentence), you can open any number of FITS images or tables with DS9
or TOPCAT with the commands below:
+With the default options, the script will generate a circular radial profile
using the mean value and centered at the center of the image.
+For more flexibility, several options are available to configure the desired
radial profile.
+For example, you can change the center position, the maximum radius, the axis
ratio and the position angle (for elliptical apertures), the operator for
obtaining the profiles, and more (described below).
-@example
-$ ds9 image-a.fits image-b.fits
-$ topcat table-a.fits table-b.fits
-@end example
+@cartouche
+@noindent
+@strong{Debug your profile:} to debug your results, especially close to the
center of your object, you can see the radial distance associated with every
pixel in your input.
+To do this, use @option{--keeptmp} to keep the temporary files, and compare
@file{crop.fits} (crop of your input image centered on your desired coordinate)
with @file{apertures.fits} (radial distance of each pixel).
+@end cartouche
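+
+For example, something like the commands below should work (the image name
and center are hypothetical; @option{--tmpdir} just gives the temporary
directory a fixed name):
+
+@example
+$ astscript-radial-profile image.fits --center=1.23,4.567 \
+           --mode=wcs --keeptmp --tmpdir=radial-tmp
+$ astscript-fits-view radial-tmp/crop.fits \
+           radial-tmp/apertures.fits
+@end example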
-But usually the default mode is not enough.
-For example, in DS9, the window can be too small (not covering the height of
your monitor), you probably want to match and lock multiple images, you have a
favorite color map that you prefer to use, or you may want to open a
multi-extension FITS file as a cube.
+@cartouche
+@noindent
+@strong{Finding properties of your elliptical target:} suppose you want to
measure the radial profile of a galaxy, but do not know its exact location,
position angle or axis ratio.
+To obtain these values, you can use @ref{NoiseChisel} to detect signal in the
image, feed it to @ref{Segment} to do basic segmentation, then use
@ref{MakeCatalog} to measure the center (@option{--x} and @option{--y} in
MakeCatalog), axis ratio (@option{--axis-ratio}) and position angle
(@option{--position-angle}); see the sketch after this box.
+@end cartouche
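+
+A minimal sketch of that chain of commands might look like this (the file
names are hypothetical and the programs' default output HDUs are assumed):
+
+@example
+$ astnoisechisel image.fits -odetections.fits
+$ astsegment detections.fits -osegments.fits
+$ astmkcatalog segments.fits --x --y --axis-ratio \
+               --position-angle -otarget.fits
+@end example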
-Using the simple commands above, you need to manually do all these in the DS9
window once it opens and this can take several tens of seconds (which is enough
to distract you from what you wanted to inspect).
-For example, if you have a multi-extension file containing 2D images, one way
to load and switch between each 2D extension is to take the following steps in
the SAO DS9 window: @clicksequence{``File''@click{}``Open Other''@click{}``Open
Multi Ext Cube''} and then choose the Multi extension FITS file in your
computer's file structure.
+@cartouche
+@noindent
+@strong{Masking other sources:} The image of an astronomical object will
usually have many other sources along with your main target.
+A crude solution is to use sigma-clipped measurements for the profile.
+However, sigma-clipped measurements can easily be biased when the number of
sources at each radial distance increases at larger distances.
+Therefore a robust solution is to mask all other detections within the image.
+You can use @ref{NoiseChisel} and @ref{Segment} to detect and segment the
sources, then set all pixels that do not belong to your target to blank using
@ref{Arithmetic} (in particular, its @code{where} operator).
+@end cartouche
-@cindex @option{-mecube} (DS9)
-The method above is a little tedious to do every time you want view a
multi-extension FITS file.
-A different series of steps is also necessary if you the extensions are 3D
data cubes (since they are already cubes, and should be opened as multi-frame).
-Furthermore, if you have multiple images and want to ``match'' and ``lock''
them (so when you zoom-in to one, all get zoomed-in) you will need several
other sequence of menus and clicks.
+@table @option
+@item -h STR
+@itemx --hdu=STR
+The HDU/extension of the input image to use.
-Fortunately SAO DS9 also provides command-line options that you can use to
specify a particular behavior before/after opening a file.
-One of those options is @option{-mecube} which opens a FITS image as a
multi-extension data cube (treating each 2D extension as a slice in a 3D cube).
-This allows you to flip through the extensions easily while keeping all the
settings similar.
-Just to avoid confusion, note that SAO DS9 does not follow the GNU style of
separating long and short options as explained in @ref{Arguments and options}.
-In the GNU style, this `long' (multi-character) option should have been called
like @option{--mecube}, but SAO DS9 follows its own conventions.
+@item -o STR
+@itemx --output=STR
+Filename of the measured radial profile.
+It can be either a FITS table or a plain-text table (determined from your
given file name suffix).
-For example, try running @command{$ds9 -mecube foo.fits} to see the effect
(for example, on the output of @ref{NoiseChisel}).
-If the file has multiple extensions, a small window will also be opened along
with the main DS9 window.
-This small window allows you to slide through the image extensions of
@file{foo.fits}.
-If @file{foo.fits} only consists of one extension, then SAO DS9 will open as
usual.
+@item -c FLT[,FLT[,...]]
+@itemx --center=FLT[,FLT[,...]]
+The central position of the radial profile.
+This option is used for placing the center of the profiles.
+This parameter is used in @ref{Crop} to center and crop the region.
+The positions along each dimension must be separated by a comma (@key{,}) and
fractions are also acceptable.
+The number of values given to this option must be the same as the dimensions
of the input dataset.
+The units of the coordinates are read based on the value given to the
@option{--mode} option, see below.
-On the other hand, for visualizing the contents of tables (that are also
commonly stored in the FITS format), you need to call a different software
(most commonly, people use TOPCAT, see @ref{TOPCAT}).
-And to make things more inconvenient, by default both of these are only
installed as command-line software, so while you are navigating in your GUI,
you need to open a terminal there, and run these commands.
-All of the issues above are the founding purpose of the installed script that
is introduced in @ref{Invoking astscript-fits-view}.
+@item -O STR
+@itemx --mode=STR
+Interpret the center position of the object (values given to
@option{--center}) in image or WCS coordinates.
+This option thus accepts only two values: @option{img} or @option{wcs}.
+By default, it is @option{--mode=img}.
-@menu
-* Invoking astscript-fits-view:: How to call this script
-@end menu
+@item -R FLT
+@itemx --rmax=FLT
+Maximum radius for the radial profile (in pixels).
+By default, the radial profile will be computed up to a radial distance equal
to the maximum radius that fits into the image (assuming circular shape).
-@node Invoking astscript-fits-view, , Viewing FITS file contents with DS9 or
TOPCAT, Viewing FITS file contents with DS9 or TOPCAT
-@subsection Invoking astscript-fits-view
+@item -P INT
+@itemx --precision=INT
+The precision (number of digits after the decimal point) in resolving the
radius.
+The default value is @option{--precision=0} (or @option{-P0}), and the value
cannot be larger than @option{6}.
+A higher precision is primarily useful when the very central few pixels are
important for you.
+A larger precision will over-resolve larger radial regions, causing scatter to
significantly affect the measurements.
-Given any number of FITS files, this script will either open SAO DS9 (for
images or cubes) or TOPCAT (for tables) to visualize their contents in a
graphic user interface (GUI).
-For more on installed scripts please see (see @ref{Installed scripts}).
-This script can be used with the following general template:
+For example, in the command below, we will generate the radial profile of an
imaginary source (at RA,DEC of 1.23,4.567) and check the output without setting
a precision:
@example
-$ astscript-fits-view [OPTION] input.fits [input-b.fits ...]
+$ astscript-radial-profile image.fits --center=1.23,4.567 \
+ --mode=wcs --measure=mean,area --rmax=10 \
+ --output=radial.fits --quiet
+$ asttable radial.fits --head=10 -ffixed -p4
+0.0000 0.0139 1
+1.0000 0.0048 8
+2.0000 0.0023 16
+3.0000 0.0015 20
+4.0000 0.0011 24
+5.0000 0.0008 40
+6.0000 0.0006 36
+7.0000 0.0005 48
+8.0000 0.0004 56
+9.0000 0.0003 56
@end example
-@noindent
-One line examples
+Let's repeat the command above, but use a precision of 3 to resolve finer
details of the radial profile, while only printing the top 10 rows of the
profile:
@example
-## Call TOPCAT to load all the input FITS tables.
-$ astscript-fits-view table-*.fits
-
-## Call SAO DS9 to open all the input FITS images.
-$ astscript-fits-view image-*.fits
+$ astscript-radial-profile image.fits --center=1.23,4.567 \
+ --mode=wcs --measure=mean,area --rmax=10 \
+ --precision=3 --output=radial.fits --quiet
+$ asttable radial.fits --head=10 -ffixed -p4
+0.0000 0.0139 1
+1.0000 0.0056 4
+1.4140 0.0040 4
+2.0000 0.0027 4
+2.2360 0.0024 8
+2.8280 0.0018 4
+3.0000 0.0017 4
+3.1620 0.0016 8
+3.6050 0.0013 8
+4.0000 0.0011 4
@end example
-This script will use Gnuastro's @ref{Fits} program to see if the file is a
table or image.
-If the first input file contains an image HDU, then the sequence of files will
be given to @ref{SAO DS9}.
-Otherwise, the input(s) will be given to @ref{TOPCAT} to visualize (plot) as
tables.
-When opening DS9 it will also inspect the dimensionality of the first image
HDU of the first input and open it slightly differently when the input is 2D or
3D:
-
-@table @asis
-@item 2D
-DS9's @option{-mecube} will be used to open all the 2D extensions of each
input file as a ``Multi-extension cube''.
-A ``Cube'' window will also be opened with DS9 that can be used to slide/flip
through each extensions.
-When multiple files are given, each file will be in one ``frame''.
-
-@item 3D
-DS9's @option{-multiframe} option will be used to open all the extensions in a
separate ``frame'' (since each input is already a 3D cube, the @option{-mecube}
option can be confusing).
-To flip through the extensions (while keeping the slice fixed), click the
``frame'' button on the top row of buttons, then use the last four buttons of
the bottom row ("first", "previous", "next" and "last") to change between the
extensions.
-If multiple files are given, there will be a separate frame for each HDU of
each input (each HDU's name or number will be put in square brackets after its
name).
-@end table
+Do you see how many more radii have been added?
+Between 1.0 and 2.0, we now have one extra radius; between 2.0 and 3.0, we
have two new radii, and so on.
+If you go to larger and larger radii, you will notice that they get resolved
into many sub-components and the number of pixels used in each measurement will
not be significant (you can already see that in the comparison above).
+This has two problems:
+1. statistically, the scatter at larger radii (where the signal-to-noise ratio
is usually low) will make it hard to interpret the profile;
+2. technically, the output table will have many more rows!
@cartouche
@noindent
-@strong{Double-clicking on FITS file to open DS9 or TOPCAT:} for those graphic
user interface (GUI) that follow the freedesktop.org standards (including
GNOME, KDS Plasma, or Xfce) Gnuastro installs a @file{fits-view.desktop} file
to instruct your GUI to call this script for opening FITS files when you click
on them.
-To activate this feature take the following steps:
-@enumerate
-@item
-Run the following command, while replacing @code{PREFIX}.
-If you do not know what to put in @code{PREFIX}, run @command{which astfits}
on the command-line, and extract @code{PREFIX} from the output (the string
before @file{/bin/astfits}).
-For more, see @ref{Installation directory}.
-@example
-ln -sf PREFIX/share/gnuastro/astscript-fits-view.desktop \
- ~/.local/share/applications/
-@end example
-@item
-Right-click on a FITS file, and choose these items in order (based on GNOME,
may be different in KDE or Xfce): @clicksequence{``Open with other
application''@click{}``View all applications''@click{}``astscript-fits-view''}.
-@end enumerate
+@strong{Use higher precision only for small radii:} If you want to look at
the whole profile (or the outer parts!), do not set the precision; the default
mode is usually more than enough!
+But when you are targeting the very central few pixels (usually a radius of
less than 5 pixels), use a higher precision.
@end cartouche
+@item -v INT
+@itemx --oversample=INT
+Oversample the input dataset by the factor given to this option.
+Therefore if you set @option{--rmax=20} for example, and
@option{--oversample=5}, your output will have 100 rows (without
@option{--oversample} it will only have 20 rows).
+Unless the object is heavily undersampled (the pixels are larger than the
actual object), this method provides a much more accurate result, since there
will be a sufficient number of pixels to measure the profile accurately.
-@noindent
-This script takes the following options
+Due to the discrete nature of pixels, if you use this option to oversample
your profile, set @option{--precision=0}.
+Otherwise, your profile will become step-like (with several radii having a
single value).
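+
+For example, a command like this (with a hypothetical input image) will
produce a profile with 100 rows:
+
+@example
+$ astscript-radial-profile image.fits --rmax=20 \
+           --oversample=5 --precision=0
+@end example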
-@table @option
+@item -u INT
+@itemx --undersample=INT
+Undersample the input dataset by the number given to this option.
+This option is for considering larger apertures than the original pixel size
(aperture size is equal to 1 pixel).
+For example, if a radial profile computed by default has 100 different radii
(apertures of 1 pixel width), by considering @option{--undersample=2} the
radial profile will be computed over apertures of 2 pixels, so the final radial
profile will have 50 different radii.
+This option is good for measuring over a larger number of pixels, to improve
each measurement.
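+
+For example, a command like this (again with a hypothetical image) will
measure the profile over 2-pixel-wide apertures:
+
+@example
+$ astscript-radial-profile image.fits --rmax=100 --undersample=2
+@end example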
-@item -h STR
-@itemx --hdu=STR
-The HDU (extension) of the input dataset to display.
-The value can be the HDU name or number (the first HDU is counted from 0).
+@item -Q FLT
+@itemx --axis-ratio=FLT
+The axis ratio of the apertures (minor axis divided by the major axis in a 2D
ellipse).
+By default (when this option is not given), the radial profile will be
circular (axis ratio of 1).
+This parameter is used as the option @option{--qcol} in the generation of the
apertures with @command{astmkprof}.
-@item -p STR
-@itemx --prefix=STR
-Directory to search for SAO DS9 or TOPCAT's executables (assumed to be
@command{ds9} and @command{topcat}).
-If not called they will be assumed to be present in your @file{PATH} (see
@ref{Installation directory}).
-If you do not have them already installed, their installation directories are
available in @ref{SAO DS9} and @ref{TOPCAT} (they can be installed in
non-system-wide locations that do not require administrator/root permissions).
+@item -p FLT
+@itemx --position-angle=FLT
+The position angle (in degrees) of the profiles relative to the first FITS
axis (horizontal when viewed in SAO DS9).
+By default, it is @option{--position-angle=0}, which means that the semi-major
axis of the profiles will be parallel to the first FITS axis.
-@item -s STR
-@itemx --ds9scale=STR
-The string to give to DS9's @option{-scale} option.
-You can use this option to use a different scaling.
-The Fits-view script will place @option{-scale} before your given string when
calling DS9.
-If you do not call this option, the default behavior is to cal DS9 with:
@option{-scale mode zscale} or @option{--ds9scale="mode zscale"} when using
this script.
+@item -a FLT,FLT
+@itemx --azimuth=FLT,FLT
+@cindex Wedge (radial profile)
+@cindex Azimuthal range (radial profile)
+Limit the profile to the given azimuthal angle range (two numbers given to
this option, in degrees, from 0 to 360) from the major axis (defined by
@option{--position-angle}).
+The radial profile will therefore be created on a wedge-like shape, not the
full circle/ellipse.
+The pixel containing the center of the profile will always be included in the
profile (because it contains all azimuthal angles!).
-The Fits-view script has the following aliases to simplify the calling of this
option (and avoid the double-quotations and @code{mode} in the example above):
+If the first angle is @emph{smaller} than the second (for example,
@option{--azimuth=10,80}), the region between, or @emph{inside}, the two angles
will be used.
+Otherwise (for example, @option{--azimuth=80,10}), the region @emph{outside}
the two angles will be used.
+The latter case can be useful when you want to ignore part of the 2D shape
(for example, due to a bright star that can be contaminating it).
-@table @option
-@item zscale
-or @option{--ds9scale=zscale} equivalent to @option{--ds9scale="mode zscale"}.
-@item minmax
-or @option{--ds9scale=minmax} equivalent to @option{--ds9scale="mode minmax"}.
-@end table
+You can visually see the shape of the region used by running this script with
@option{--keeptmp} and viewing the @file{values.fits} and @file{apertures.fits}
files of the temporary directory with a FITS image viewer like @ref{SAO DS9}.
+You can use @ref{Viewing FITS file contents with DS9 or TOPCAT} to open them
together in one instance of DS9, with both frames matched and locked (for easy
comparison in case you want to zoom-in or out).
+For example, see the commands below (based on your target object, just change
the image name, center, position angle, etc.):
-@item -c=FLT,FLT
-@itemx --ds9center=FLT,FLT
-The central coordinate for DS9's view of the FITS image after it opens.
-This is equivalent to the ``Pan'' button in DS9.
-The nature of the coordinates will be determined by the @option{--ds9mode}
option that is described below.
+@example
+## Generate the radial profile
+$ astscript-radial-profile image.fits --center=1.234,6.789 \
+ --mode=wcs --rmax=50 --position-angle=20 \
+ --axis-ratio=0.8 --azimuth=95,150 --keeptmp \
+ --tmpdir=radial-tmp
-@item -O img/wcs
-@itemx --ds9mode=img/wcs
-The coordinate system (or mode) to interpret the values given to
@option{--ds9center}.
-This can either be @option{img} (or DS9's ``Image'' coordinates) or
@option{wcs} (or DS9's ``wcs fk5'' coordinates).
+## Visually check the values and apertures used.
+$ astscript-fits-view radial-tmp/values.fits \
+ radial-tmp/apertures.fits
+@end example
-@item -g INTxINT
-@itemx --ds9geometry=INTxINT
-The initial DS9 window geometry (value to DS9's @option{-geometry} option).
-@item -m
-@itemx --ds9colorbarmulti
-Do Not show a single color bar for all the loaded images.
-By default this script will call DS9 in a way that a single color bar is shown
for any number of images.
-A single color bar is preferred for two reasons: 1) when there are a lot of
images, they consume a large fraction of the display area. 2) the color-bars
are locked by this script, so there is no difference between!
-With this option, you can have separate color bars under each image.
-@end table
+@item -m STR
+@itemx --measure=STR
+The operator for measuring the values over each radial distance.
+The values given to this option will be directly passed to @ref{MakeCatalog}.
+As a consequence, all MakeCatalog measurements like the magnitude, magnitude
error, median, mean, signal-to-noise ratio (S/N), std, surface brightness,
sigclip-mean, and sigclip-number can be used here.
+For a full list of MakeCatalog's measurements, please run
@command{astmkcatalog --help} or see @ref{MakeCatalog measurements}.
+Multiple values can be given to this option, each separated by a comma.
+This option can also be called multiple times.
+
+@cartouche
+@noindent
+@strong{Masking background/foreground objects:} For crude rejection of
outliers, you can use sigma-clipped MakeCatalog measurements like
@option{--sigclip-mean} or @option{--sigclip-mean-sb} (see @ref{MakeCatalog
measurements}).
+To properly mask the effect of background/foreground objects on your target
object's radial profile, you can use the @command{astscript-psf-stamp} script
(see @ref{Invoking astscript-psf-stamp}) and feed it the output of
@ref{Segment}.
+This script will mask unwanted objects in the image that is later used to
measure the radial profile.
+@end cartouche
+
+Some measurements by MakeCatalog require a per-pixel sky standard deviation
(for example, the magnitude error or S/N).
+Therefore when asking for such measurements, use the @option{--instd} option
(described below) to specify the sky standard deviation over each pixel.
+For other measurements like the magnitude or surface brightness, MakeCatalog
will need a zero point, which you can set with the @option{--zeropoint} option.
+For example, by setting @option{--measure=mean,sigclip-mean --measure=median},
the mean, sigma-clipped mean and median values will be computed.
+The output radial profile will have 4 columns in this order: radial distance,
mean, sigma-clipped mean, and median.
+By default (when this option is not given), the mean of all pixels at each
radial position will be computed.
+@item -s FLT,FLT
+@itemx --sigmaclip=FLT,FLT
+Sigma clipping parameters: only relevant if sigma-clipping operators are
requested by @option{--measure}.
+For more on sigma-clipping, see @ref{Sigma clipping}.
+If given, the value to this option is directly passed to the
@option{--sigmaclip} option of @ref{MakeCatalog}, see @ref{MakeCatalog inputs
and basic settings}.
+By default (when this option is not given), the default values within
MakeCatalog will be used.
+To see the default value of this option in MakeCatalog, you can run this
command:
+@example
+$ astmkcatalog -P | grep " sigmaclip "
+@end example
+@item -z FLT
+@itemx --zeropoint=FLT
+The zero point of the input dataset.
+This is necessary when you request measurements like the magnitude or surface
brightness.
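+
+For example, a command like this (the zero point value of 22.5 is
hypothetical) will produce a magnitude profile:
+
+@example
+$ astscript-radial-profile image.fits --measure=magnitude \
+           --zeropoint=22.5
+@end example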
-@c Update the ``previous'' and next items: C-c C-u C-e
-@c Update the menu: C-u C-c C-u m
-@node Zero point estimation, Dithering pattern simulation, Viewing FITS file
contents with DS9 or TOPCAT, Installed scripts
-@section Zero point estimation
+@item -Z
+@itemx --zeroisnotblank
+Account for zero-valued pixels in the profile.
+By default, such pixels are not considered (when this script crops the
necessary region of the image before generating the profile).
+The long format of this option is identical to a similarly named option in
Crop (see @ref{Invoking astcrop}).
+When this option is called, it is passed directly to Crop; therefore, the
zero-valued pixels are not considered blank and are used in the profile
creation.
-@cindex Zero point
-@cindex Calibration
-@cindex Astrometry
-Through the ``zero point'', we are able to give physical units to the pixel
values of an image (often in units of ``counts'' or ADUs) and thus compare them
with other images (as well as measurements that are done on them).
-The zero point is therefore an important calibration of pixel values (as
astromerty is a calibration of the pixel positions).
-The fundamental concepts behind the zero point are described in
@ref{Brightness flux magnitude}.
-We will therefore not go deeper into the basics here and stick to the
practical aspects of it.
+@item -i FLT/STR
+@itemx --instd=FLT/STR
+Sky standard deviation as a single number (FLT) or as the filename (STR)
containing the image with the std value for each pixel (the HDU within the file
should be given to the @option{--stdhdu} option mentioned below).
+This is only necessary when the requested measurement (value given to
@option{--measure}) by MakeCatalog needs the Standard deviation (for example,
the signal-to-noise ratio or magnitude error).
+If your measurements do not require a standard deviation, it is best to ignore
this option (because it will slow down the script).
-The purpose of Gnuastro’s @command{astscript-zeropoint} script is to obtain
the zero point of an image by considering another image (where the zero point
is already known), or a catalog.
-In the
-The operation involves multiple lower-level programs in a standard series of
steps.
-For example, when using another image, the script will take the following
steps:
+@item -d INT/STR
+@itemx --stdhdu=INT/STR
+HDU/extension of the sky standard deviation image specified with
@option{--instd}.
-@enumerate
-@item
-Download the Gaia catalog that overlaps with the input image using Gnuastro’s
Query program (see @ref{Query}).
-This is done to determine the stars within the image@footnote{Stars have an
almost identical shape in the image (as opposed to galaxies for example), using
confirmed stars will produce a more reliable result.}.
-@item
-Perform aperture photometry@footnote{For a complete tutorial on aperture
photometry, see @ref{Aperture photometry}.} with @ref{MakeProfiles} and
@ref{MakeCatalog}.
-We will assume a zero point of 0 for the input image.
-If the reference is an image, then we should perform aperture photometry also
in that image.
-@item
-Match the two catalogs@footnote{For a tutorial on matching catalogs, see
@ref{Matching catalogs}.} with @ref{Match}.
-@item
-The difference between the input and reference magnitudes should be
independent of the magnitude of the stars.
-This does not hold when the stars are saturated in one/both the images (giving
us a bright-limit for the magnitude range to use) or for stars fainter than a
certain magnitude, where the signal-to-noise ratio drops significantly in
one/both images (giving us a faint limit for the magnitude range to use).
-@item
-Since a zero point of 0 was used for the input image, the magnitude difference
above (in the reliable magnitude range) is the zero point of the input image.
-@end enumerate
+@item -t STR
+@itemx --tmpdir=STR
+Several intermediate files are necessary to obtain the radial profile.
+All of these temporary files are saved into a temporary directory.
+With this option, you can directly specify this directory.
+By default (when this option is not called), it will be built in the running
directory and given an input-based name.
+If the directory does not exist at run-time, this script will create it.
+Once the radial profile has been obtained, this directory is removed.
+You can disable the deletion of the temporary directory with the
@option{--keeptmp} option.
-In the sections below we have prepared two tutorials on the use of this script.
-The first uses an image as a reference (@ref{Zero point tutorial with
reference image}) and the second uses a catalog (@ref{Zero point tutorial with
reference catalog}).
-Afterwards, in @ref{Invoking astscript-zeropoint}, the details of all the
options and how to run this script are provided.
+@item -k
+@itemx --keeptmp
+Do not delete the temporary directory (see description of @option{--tmpdir}
above).
+This option is useful for debugging; for example, to check that the profiles
generated for obtaining the radial profile have the desired center, shape and
orientation.
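+For example, to inspect the intermediate files after the script finishes,
a call like this (with hypothetical names) may be used:
+@example
+## Keep the temporary directory for debugging.
+$ astscript-radial-profile image.fits \
+                 --tmpdir=radial-tmp --keeptmp
+@end example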
+@end table
-@menu
-* Zero point tutorial with reference image:: Using SDSS images to find J-PLUS
zero point
-* Zero point tutorial with reference catalog:: Using SDSS catalog to find
J-PLUS zero point
-* Invoking astscript-zeropoint:: How to call the script
-@end menu
-@node Zero point tutorial with reference image, Zero point tutorial with
reference catalog, Zero point estimation, Zero point estimation
-@subsection Zero point tutorial with reference image
-@cindex SDSS
-In this tutorial on how to use the @command{astscript-zeropoint}, we will find
the zero point of a single exposure image from the @url{https://www.j-plus.es,
J-PLUS survey}, while using an @url{http://www.sdss.org, SDSS} image as
reference (recall that all SDSS images have been calibrated to have a fixed
zero point of 22.5).
-In this case, both images that we are using were taken with the SDSS @emph{r}
filter.
-@cartouche
-@cindex Johnson filters
-@cindex Johnson vs. SDSS filters
-@cindex SDSS vs. Johnson filters
-@cindex Filter transmission curve
-@cindex Transmission curve of filters
-@cindex SVO database (filter transmission curve)
-@noindent
-@strong{Same filters and SVO filter database:} It is very important that both
your images are taken with the same filter.
-When looking at filter names, don't forget that different filter systems
sometimes have the same names for one filter, such as the name ``R''; which is
used in both the Johnson and SDSS filter systems.
-Hence if you confront an image in the ``R'' or ``r'' filter, double check to
see exactly which filter system it corresponds to.
-If you know which observatory your data came from, you can use the
@url{http://svo2.cab.inta-csic.es/theory/fps, SVO database} to confirm the
similarity of the transmission curves of the filters of your input and
reference images.
-SVO contains the filter data for many of the observatories world-wide.
-@end cartouche
-First, let’s create a directory named @file{tutorial-zeropoint} to keep things
clean and work in that.
-Then, with the commands below, you can download an image from J-PLUS and SDSS.
-To speed up the analysis, the image is cropped to have a smaller region around
its center.
-@example
-$ mkdir tutorial-zeropoint
-$ cd tutorial-zeropoint
-$ jplusdr2=http://archive.cefca.es/catalogues/vo/siap/jplus-dr2/reduced
-$ wget $jplusdr2/get_fits?id=771463 -O jplus.fits.fz
-$ astcrop jplus.fits.fz --center=107.7263,40.1754 \
- --width=0.6 --output=jplus-crop.fits
-@end example
-Although we cropped the J-PLUS image, it is still very large in comparison
with the SDSS image (the J-PLUS field of view is almost @mymath{1.5\times1.5}
deg@mymath{^2}, while the field of view of SDSS in each filter is almost
@mymath{0.3\times0.5} deg@mymath{^2}).
-Therefore, let's download two SDSS images (and then decompress them) in the
region of the cropped J-PLUS image to have a more accurate result compared to a
single SDSS footprint: generally, your zero point estimation will have less
scatter with more overlap between your reference image(s) and your input image.
-@example
-$ sdssbase=https://dr12.sdss.org/sas/dr12/boss/photoObj/frames
-$ wget $sdssbase/301/6509/5/frame-r-006509-5-0115.fits.bz2 \
- -O sdss1.fits.bz2
-$ wget $sdssbase/301/6573/5/frame-r-006573-5-0174.fits.bz2 \
- -O sdss2.fits.bz2
-$ bunzip2 sdss1.fits.bz2
-$ bunzip2 sdss2.fits.bz2
-@end example
-To have a feeling of the data, let's open the three images with
@command{astscript-fits-view} using the command below.
-Wait a few seconds to see the three images ``blinking'' one after another.
-The largest one is the J-PLUS crop and the two smaller ones that partially
cover it in different regions are from SDSS.
-@example
-$ astscript-fits-view sdss1.fits sdss2.fits jplus-crop.fits \
- --ds9extra="-lock frame wcs -single -zoom to fit -blink yes"
-@end example
-The test above showed that the three images are already astrometrically
calibrated (the coverage of the pixel positions on the sky is correct in both).
-To confirm, you can zoom-in to a certain object and confirm it on a pixel
level.
-It is always good to do the visual check above when you are confronted with
new images (and may not be confident about the accuracy of the astrometry).
-Do not forget that the goal here is to find the calibration of pixel values;
and that we assume pixel positions are already calibrated (the image already
has a good astrometry).
-The SDSS images are Sky subtracted, while this single-exposure J-PLUS image
still contains the counts related to the Sky emission within them.
-In the J-PLUS survey, the sky-level in each pixel is kept in a separate
@code{BACKGROUND_MODEL} HDU of @file{jplus.fits.fz}; this allows you to use a
different sky if you like.
-The SDSS image FITS files also have multiple extensions.
-To understand our inputs, let's have a fast look at the basic info of each:
-@example
-$ astfits sdss1.fits
-Fits (GNU Astronomy Utilities) @value{VERSION}
-Run on Fri Apr 14 11:24:03 2023
------
-HDU (extension) information: 'sdss1.fits'.
- Column 1: Index (counting from 0, usable with '--hdu').
- Column 2: Name ('EXTNAME' in FITS standard, usable with '--hdu').
- ('n/a': no name in HDU metadata)
- Column 3: Image data type or 'table' format (ASCII or binary).
- Column 4: Size of data in HDU.
- Column 5: Units of data in HDU (only images).
- ('n/a': no unit in HDU metadata, or HDU is a table)
------
-0 n/a float32 2048x1489 nanomaggy
-1 n/a float32 2048 n/a
-2 n/a table_binary 1x3 n/a
-3 n/a table_binary 1x31 n/a
-$ astfits jplus.fits.fz
-Fits (GNU Astronomy Utilities) @value{VERSION}
-Run on Fri Apr 14 11:21:30 2023
------
-HDU (extension) information: 'jplus.fits.fz'.
- Column 1: Index (counting from 0, usable with '--hdu').
- Column 2: Name ('EXTNAME' in FITS standard, usable with '--hdu').
- ('n/a': no name in HDU metadata)
- Column 3: Image data type or 'table' format (ASCII or binary).
- Column 4: Size of data in HDU.
- Column 5: Units of data in HDU (only images).
- ('n/a': no unit in HDU metadata, or HDU is a table)
------
-0 n/a no-data 0 n/a
-1 IMAGE float32 9216x9232 adu
-2 MASKED_PIXELS int16 9216x9232 n/a
-3 BACKGROUND_MODEL float32 9216x9232 n/a
-4 MASK_MODEL uint8 9216x9232 n/a
-@end example
-Therefore, in order to be able to compare the SDSS and J-PLUS images, we
should first subtract the sky from the J-PLUS image.
-To do that, we can either subtract the @code{BACKGROUND_MODEL} HDU from the
@code{IMAGE} HDU using @ref{Arithmetic}, or we can use @ref{NoiseChisel} to
find a good sky ourselves.
-As scientists we like to tweak and be creative, so let's estimate it ourselves!
-Also, in some cases, you may not have a pre-estimated Sky level, so you
should be prepared:
-@example
-$ astnoisechisel jplus-crop.fits --output=jplus-nc.fits
-$ astscript-fits-view jplus-nc.fits
-@end example
-Notice that there is a relatively bright star in the center-bottom of the
image.
-In the ``Cube'' window, click on the ``Next'' button to see the
@code{DETECTIONS} HDU.
-The large footprint of the bright star is obvious.
-Press the ``Next'' button one more time to get to the @code{SKY} HDU.
-You see that in the center-bottom, the footprint of the large star is clearly
visible in the measured Sky level.
-This is not good: there are Sky values above 54 ADU in the center of the
star (the white pixels)!
-This over-subtracted Sky level in part of the image will affect your magnitude
measurements and thus the zero point!
-In @ref{General program usage tutorial}, we have a section on @ref{NoiseChisel
optimization for detection}; there is also a full tutorial on this in
@ref{Detecting large extended targets}.
-Therefore, we will not go into the details of NoiseChisel optimization here.
-Given the large images of J-PLUS, we will increase the tile-size to
@mymath{100\times100} pixels and the number of neighbors to identify outlying
tiles to 50 (these are usually the first parameters you should start editing
when you are confronted with a new image).
-After the second command, check the @code{SKY} extension to confirm that there
is no footprint of any bright object there.
-You will still see a gradient, but note the minimum and maximum values of the
Sky level: their difference is more than 26 times smaller than the noise
standard deviation (so statistically speaking, it is pretty flat!)
+@node SAO DS9 region files from table, Viewing FITS file contents with DS9 or
TOPCAT, Generate radial profile, Installed scripts
+@section SAO DS9 region files from table
-@example
-$ astnoisechisel jplus-crop.fits --output=jplus-nc.fits \
- --tilesize=100,100 --outliernumngb=50
-$ astscript-fits-view jplus-nc.fits
+Once your desired catalog (containing the positions of some objects) is
created (for example, with @ref{MakeCatalog}, @ref{Match}, or @ref{Table}), it
often happens that you want to see your selected objects on an image to get a
feeling of their spatial properties.
+For example, you want to see their positions relative to each other.
+In this section we describe a simple installed script that is provided within
Gnuastro for converting your given columns to an SAO DS9 region file to help in
this process.
+SAO DS9@footnote{@url{http://ds9.si.edu}} is one of the most common FITS image
visualization tools in astronomy and is free software.
-## Check that the gradient in the sky is statistically negligible.
-$ aststatistics jplus-nc.fits -hSKY --minimum --maximum \
- | awk '@{print $2-$1@}'
-0.32809
-$ aststatistics jplus-nc.fits -hSKY_STD --median
-8.377977e+00
-@end example
+@menu
+* Invoking astscript-ds9-region:: How to call astscript-ds9-region
+@end menu
-We are now ready to find the zero point!
-First, let's run the @command{astscript-zeropoint} with @option{--help} to see
the option names (recall that you can see more details of each option in
@ref{Invoking astscript-zeropoint}).
-For the first time, let's use the script in the most simple state possible.
-We will keep only the essential options: the names of the input and reference
images (and their HDUs), the name of the output, and also two apertures with
radii of 3 arcsec to start with:
+@node Invoking astscript-ds9-region, , SAO DS9 region files from table, SAO
DS9 region files from table
+@subsection Invoking astscript-ds9-region
+
+This installed script will read two positional columns within an input table
and generate an SAO DS9 region file to visualize the position of the given
objects over an image.
+For more on installed scripts, please see @ref{Installed scripts}.
+This script can be used with the following general template:
@example
-$ astscript-zeropoint --help
-$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
- --refimgs=sdss1.fits,sdss2.fits \
- --output=jplus-zeropoint.fits \
- --refimgszp=22.5,22.5 \
- --refimgshdu=0,0 \
- --aperarcsec=3
-@end example
+## Use the RA and DEC columns of 'table.fits' for the region file.
+$ astscript-ds9-region table.fits --column=RA,DEC \
+ --output=ds9.reg
-The output is a FITS table (because generally, you will give more apertures
and choose the best one based on a higher-level analysis).
-Let's check the output's internal structure with Gnuastro's @command{astfits}
program.
+## Select objects with a magnitude between 18 to 20, and generate the
+## region file directly (through a pipe), each region with radius of
+## 0.5 arcseconds.
+$ asttable table.fits --range=MAG,18:20 --column=RA,DEC \
+ | astscript-ds9-region --column=1,2 --radius=0.5
-@example
-$ astfits jplus-zeropoint.fits
------
-0 n/a no-data 0 n/a
-1 ZEROPOINTS table_binary 1x3 n/a
-2 APER-3 table_binary 321x2 n/a
+## With the first command, select objects with a magnitude of 25 to 26
+## as red regions in 'bright.reg'. With the second command, select
+## objects with a magnitude between 28 to 29 as a green region and
+## show both.
+$ asttable cat.fits --range=MAG_F160W,25:26 -cRA,DEC \
+ | astscript-ds9-region -c1,2 --color=red -obright.reg
+$ asttable cat.fits --range=MAG_F160W,28:29 -cRA,DEC \
+ | astscript-ds9-region -c1,2 --color=green \
+ --command="ds9 image.fits -regions bright.reg"
@end example
-You can see that there are two HDUs in this file.
-The HDU names give a hint, so let's have a look at each extension with
Gnuastro's @command{asttable} program:
+The input can either be passed as a named file, or from standard input (a
pipe).
+Only the @option{--column} option is mandatory (to specify the input table
columns): two columns from the input table must be specified, either by name
(recommended) or number.
+You can optionally also specify the radius, width and color of the regions
with the @option{--radius}, @option{--width} and @option{--color} options;
otherwise, default values will be used for these (described under each
option).
+
+The created region file will be written into the file name given to
@option{--output}.
+When @option{--output} is not called, the default name of @file{ds9.reg} will
be used (in the running directory).
+If the file exists before calling this script, it will be overwritten, unless
you pass the @option{--dontdelete} option.
+Optionally you can also use the @option{--command} option to give the full
command that should be run to execute SAO DS9 (see example above and
description below).
+In this mode, the created region file will be deleted once DS9 is closed
(unless you pass the @option{--dontdelete} option).
+A full description of each option is given below.
-@example
-$ asttable jplus-zeropoint.fits --hdu=1 -i
---------
-jplus-zeropoint.fits (hdu: 1)
-------- ----- ---- -------
-No.Name Units Type Comment
-------- ----- ---- -------
-1 APERTURE arcsec float32 n/a
-2 ZEROPOINT mag float32 n/a
-3 ZPSTD mag float32 n/a
---------
-Number of rows: 1
---------
-@end example
+@table @option
-@noindent
-As you can see, in the first extension, for each of the apertures you
requested (@code{APERTURE}), there is a zero point (@code{ZEROPOINT}) and the
standard deviation of the measurements on the apertures (@code{ZPSTD}).
-In this case, we only requested one aperture, so it only has one row.
-Now, let's have a look at the next extension:
+@item -h INT/STR
+@itemx --hdu=INT/STR
+The HDU of the input table when a named FITS file is given as input.
+The HDU (or extension) can be either a name or number (counting from zero).
+For more on this option, see @ref{Input output options}.
-@example
-$ asttable jplus-zeropoint.fits --hdu=2 -i
---------
-jplus-zeropoint.fits (hdu: 2)
-------- ----- ---- -------
-No.Name Units Type Comment
-------- ----- ---- -------
-1 MAG-REF f32 float32 Magnitude of reference.
-2 MAG-DIFF f32 float32 Magnitude diff with input.
---------
-Number of rows: 321
---------
-@end example
+@item -c STR,STR
+@itemx --column=STR,STR
+Identifiers of the two positional columns to use in the DS9 region file from
the table.
+They can either be in WCS (RA and Dec) or image (pixel) coordinates.
+The mode can be specified with the @option{--mode} option, described below.
-It contains a table of measurements for the aperture with the least scatter.
-In this case, we only gave one aperture, so it is the same.
-If you give multiple apertures, only the one with least scatter will be
present by default.
-In the @code{MAG-REF} column you see the magnitudes within each aperture on
the reference (SDSS) image(s).
-The @code{MAG-DIFF} column contains the difference of the input (J-PLUS) and
reference (SDSS) magnitudes for each aperture (see @ref{Zero point estimation}).
-The two catalogs, created by the aperture photometry from the SDSS images, are
merged into one so that there are more stars to compare.
-Therefore, no matter how many reference images you provide, there will only be
a single table here.
-If the two SDSS images overlapped, each object in the overlap region would
have two rows (one row for the measurement from one SDSS image, and another
from the measurement from the other).
+@item -n STR
+@itemx --namecol=STR
+The column containing the name (or label) of each region.
+The type of the column (numeric or a character-based string) is irrelevant:
you can use both types of columns as a name or label for the region.
+This feature is useful when you need to recognize each region with a certain
ID or property (for example, magnitude or redshift).
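+For example, assuming the input catalog has a @code{MAG} column, the
command below will label each region with its magnitude:
+@example
+## 'cat.fits' and its 'MAG' column are hypothetical.
+$ astscript-ds9-region cat.fits --column=RA,DEC \
+                 --namecol=MAG --output=labeled.reg
+@end example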
-Now that we have obtained the zero point of the J-PLUS image, let's go a
little deeper into lower-level details of how this script operates.
-This will help you better understand what happened and how to interpret and
improve the outputs when you are confronted with a new image and strange
outputs.
+@item -m wcs|img
+@itemx --mode=wcs|img
+The coordinate system of the positional columns (can be either
@option{--mode=wcs} or @option{--mode=img}).
+In the WCS mode, the values within the columns are interpreted to be RA and
Dec.
+In the image mode, they are interpreted to be pixel X and Y positions.
+This option also affects the interpretation of the value given to
@option{--radius}.
+When this option is not explicitly given, the columns are assumed to be in WCS
mode.
-To keep intermediate results the @command{astscript-zeropoint} script keeps
temporary files in a temporary directory and later deletes it (and all the
intermediate products).
-If you like to check the temporary files of the intermediate steps, you can
use @option{--keeptmp} option to not remove them.
+@item -C STR
+@itemx --color=STR
+The color to use for created regions.
+These will be interpreted directly by SAO DS9 when it opens the region file,
so the value must be recognizable by SAO DS9.
+As of SAO DS9 8.2, the recognized color names are @code{black}, @code{white},
@code{red}, @code{green}, @code{blue}, @code{cyan}, @code{magenta} and
@code{yellow}.
+The default color (when this option is not called) is @code{green}.
-Let's take a closer look into the contents of each HDU.
-First, we'll use Gnuastro’s @command{asttable} to see the measured zeropoint
for this aperture.
-We are using @option{-Y} to have human-friendly (non-scientific!) numbers
(which are sufficient here) and @option{-O} to also show the metadata of each
column at the start.
+@item -w INT
+@itemx --width=INT
+The line width of the regions.
+These will be interpreted directly by SAO DS9 when it opens the region file,
so the value must be recognizable by SAO DS9.
+The default value is @code{1}.
-@example
-$ asttable jplus-zeropoint.fits -Y -O
-# Column 1: APERTURE [arcsec,f32,] Aperture used.
-# Column 2: ZEROPOINT [mag ,f32,] Zero point (sig-clip median).
-# Column 3: ZPSTD [mag ,f32,] Zero point Standard deviation.
-3.000 26.435 0.057
-@end example
+@item -r FLT
+@itemx --radius=FLT
+The radius of all the regions.
+In WCS mode, the radius is assumed to be in arc-seconds; in image mode, it is
in pixel units.
+If this option is not explicitly given, in WCS mode the default radius is 1
arc-second and in image mode it is 3 pixels.
-@noindent
-Now, let's have a look at the first 10 rows of the second (@code{APER-3})
extension.
-From the previous check we did above, we see that it contains 321 rows!
+@item --dontdelete
+If the output file name exists, abort the program and do not over-write the
contents of the file.
+This option is thus good if you want to avoid accidentally writing over an
important file.
+Also, do not delete the created region file when @option{--command} is given
(by default, when @option{--command} is given, the created region file will be
deleted after SAO DS9 closes).
-@example
-$ asttable jplus-zeropoint.fits -Y -O --hdu=APER-3 --head=10
-# Column 1: MAG-REF [f32,f32,] Magnitude of reference.
-# Column 2: MAG-DIFF [f32,f32,] Magnitude diff with input.
-16.461 30.035
-16.243 28.209
-15.427 26.427
-20.064 26.459
-17.334 26.425
-20.518 26.504
-17.100 26.400
-16.919 26.428
-17.654 26.373
-15.392 26.429
-@end example
+@item -o STR
+@itemx --output=STR
+Write the created SAO DS9 region file into the name given to this option.
+If not explicitly given on the command-line, a default name of @file{ds9.reg}
will be used.
+If the file already exists, it will be over-written; you can avoid the
deletion (or over-writing) of an existing file with the @option{--dontdelete}
option.
-But the table above is hard to interpret, so let's plot it.
-To do this, we'll use the same @command{astscript-fits-view} command above
that we used for images.
-It detects if the file has an image or table HDU and will call DS9 or TOPCAT
respectively.
-You can also use any other plotter you like (TOPCAT is not part of Gnuastro),
this script just calls it.
+@item --command="STR"
+After creating the region file, run the string given to this option as a
command-line command.
+The SAO DS9 region command will be appended to the end of the given command.
+Because the command will most likely contain white-space characters, it is
recommended to put the given string in double quotations.
+
+For example, let's assume @option{--command="ds9 image.fits -zscale"}.
+After making the region file (assuming it is called @file{ds9.reg}), the
following command will be executed:
@example
-$ astscript-fits-view jplus-zeropoint.fits --hdu=APER-3
+ds9 image.fits -zscale -regions ds9.reg
@end example
-After @code{TOPCAT} opens, you can select the ``Graphics'' menu and then
``Plain plot''.
-This will show a plot with the SDSS (reference image) magnitude on the
horizontal axis and the difference of magnitudes between the input and
reference (the zero point) on the vertical axis.
+You can customize all aspects of SAO DS9 with its command-line options;
therefore, the value of this option can be as long and complicated as you like.
+For example, if you also want the image to fit into the window, this option
will be: @command{--command="ds9 image.fits -zscale -zoom to fit"}.
+You can see the SAO DS9 command-line descriptions by clicking on the ``Help''
menu and selecting ``Reference Manual''.
+In the opened window, click on ``Command Line Options''.
+@end table
-In an ideal world, the zero point should be independent of the magnitude of
the different stars that were used.
-Therefore, this plot should be a horizontal line (with some scatter as we go
to fainter stars).
-But as you can see in the plot, in the real world, this expected behavior is
seen only for stars with magnitudes about 16 to 19 in the reference SDSS images.
-The stars that are brighter than 16 are saturated in one (or both)
surveys@footnote{To learn more about saturated pixels and recognition of the
saturated level of the image, please see @ref{Saturated pixels and Segment's
clumps}}.
-Therefore, they do not have the correct magnitude or mag-diff.
-You can check some of these stars visually by using the blinking command above
and zooming into some of the brighter stars in the SDSS images.
-@cindex Depth of data
-On the other hand, it is natural that we cannot measure accurate magnitudes
for the fainter stars because the noise level (or ``depth'') of each image is
limited.
-As a result, the horizontal line becomes wider (scattered) as we go to the
right (fainter magnitudes on the horizontal axis).
-So, let's limit the range of used magnitudes from the SDSS catalog to
calculate a more accurate zero point for the J-PLUS image.
-For this reason, we have the @option{--magnituderange} option in
@command{astscript-zeropoint}.
-@cartouche
-@noindent
-@strong{Necessity of sky subtraction:}
-To obtain this horizontal line, it is very important that both your images
have been sky subtracted.
-Please, repeat the last @command{astscript-zeropoint} command above only by
changing the input file to @file{jplus-crop.fits}.
-Then use Gnuastro’s @command{astscript-fits-view} again to draw a plot with
@code{TOPCAT} (also same as above).
-Instead of a horizontal line, you will see @emph{a sloped line} in the
magnitude range above!
-This happens because the sky level acts as a source of constant signal in all
apertures, so the magnitude difference will not be independent of the star's
magnitude, but dependent on it (the measurement on a fainter star will be
dominated by the sky level).
-@strong{Remember:} if you see a sloped line instead of a horizontal line, the
input or reference image(s) are not sky subtracted.
-@end cartouche
+@node Viewing FITS file contents with DS9 or TOPCAT, Zero point estimation,
SAO DS9 region files from table, Installed scripts
+@section Viewing FITS file contents with DS9 or TOPCAT
-Another key parameter of this script is the aperture size
(@option{--aperarcsec}) for the aperture photometry of images.
-On one hand, if the selected aperture is too small, you will be at the mercy
of the differing PSFs between your input and reference image(s): part of the
light of the star will be lost in the image with the worse PSF.
-On the other hand, with large aperture size, the light of neighboring objects
(stars/galaxies) can affect the photometry.
-We should select an aperture radius of the same order as the one used in the
reference image, typically 2 to 3 times the PSF FWHM of the images.
-For now, let's assume the values 2, 3, 4, 5, and 6 arcsec for the aperture
sizes parameter.
-The script will compare the result for several aperture sizes and choose the
one with least standard deviation value, @code{ZPSTD} column of the
@code{ZEROPOINTS} HDU.
+@cindex Multi-Extension FITS
+@cindex Opening multi-extension FITS
+The FITS definition allows for multiple extensions (or HDUs) inside one FITS
file.
+Each HDU can have a completely independent dataset inside of it.
+One HDU can be a table, another can be an image and another can be another
independent image.
+For example, each image HDU can be one CCD of a multi-CCD camera, or in
processed images one can be the deep science image and the next can be its
weight map, alternatively, one HDU can be an image, and another can be the
catalog/table of objects within it.
-Let's re-run the script with the following changes:
-@itemize
-@item
-Using @option{--magnituderange} to limit the stars used for estimating the
zero point.
-@item
-Giving more values for aperture size to find the best for these two images as
explained above.
-@item
-Call @option{--keepzpap} option to keep the result of matching the catalogs
done with the selected apertures in the different extensions of the output file.
-@end itemize
+The most common software for viewing FITS images is SAO DS9 (see @ref{SAO
DS9}) and for plotting tables, TOPCAT is the most commonly used tool in
astronomy (see @ref{TOPCAT}).
+After installing them (as described in the respective appendix linked in the
previous sentence), you can open any number of FITS images or tables with DS9
or TOPCAT with the commands below:
@example
-$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
- --refimgs=sdss1.fits,sdss2.fits \
- --output=jplus-zeropoint.fits \
- --refimgszp=22.5,22.5 \
- --aperarcsec=2,3,4,5,6 \
- --magnituderange=16,18 \
- --refimgshdu=0,0 \
- --keepzpap
+$ ds9 image-a.fits image-b.fits
+$ topcat table-a.fits table-b.fits
@end example
-Now, check number of HDU extensions by @command{astfits}.
+But usually the default mode is not enough.
+For example, in DS9, the window can be too small (not covering the height of
your monitor), you probably want to match and lock multiple images, you have a
favorite color map that you prefer to use, or you may want to open a
multi-extension FITS file as a cube.
-@example
-$ astfits jplus-zeropoint.fits
------
-0 n/a no-data 0 n/a
-1 ZEROPOINTS table_binary 5x3 n/a
-2 APER-2 table_binary 319x2 n/a
-3 APER-3 table_binary 321x2 n/a
-4 APER-4 table_binary 323x2 n/a
-5 APER-5 table_binary 323x2 n/a
-6 APER-6 table_binary 325x2 n/a
-@end example
+Using the simple commands above, you need to manually do all these in the DS9
window once it opens and this can take several tens of seconds (which is enough
to distract you from what you wanted to inspect).
+For example, if you have a multi-extension file containing 2D images, one way
to load and switch between each 2D extension is to take the following steps in
the SAO DS9 window: @clicksequence{``File''@click{}``Open Other''@click{}``Open
Multi Ext Cube''} and then choose the Multi extension FITS file in your
computer's file structure.
-You can see that the output file now has a separate HDU for each aperture
(thanks to @option{--keepzpap}).
-The @code{ZEROPOINTS} HDU contains the final zero point values for each
aperture and their error.
-The best zero point value belongs to the aperture that has the least scatter
(has the lowest standard deviation).
-The rest of extensions contain the zero point value computed within each
aperture (as discussed above).
+@cindex @option{-mecube} (DS9)
+The method above is a little tedious to do every time you want view a
multi-extension FITS file.
+A different series of steps is also necessary if the extensions are 3D
data cubes (since they are already cubes, and should be opened as multi-frame).
+Furthermore, if you have multiple images and want to ``match'' and ``lock''
them (so when you zoom-in to one, all get zoomed-in), you will need several
other sequences of menus and clicks.
-Let's check the different tables by plotting all magnitude tables at the same
time with @code{TOPCAT}.
+Fortunately SAO DS9 also provides command-line options that you can use to
specify a particular behavior before/after opening a file.
+One of those options is @option{-mecube} which opens a FITS image as a
multi-extension data cube (treating each 2D extension as a slice in a 3D cube).
+This allows you to flip through the extensions easily while keeping all the
settings similar.
+Just to avoid confusion, note that SAO DS9 does not follow the GNU style of
separating long and short options as explained in @ref{Arguments and options}.
+In the GNU style, this `long' (multi-character) option should have been called
like @option{--mecube}, but SAO DS9 follows its own conventions.
+
+For example, try running @command{ds9 -mecube foo.fits} to see the effect
(for example, on the output of @ref{NoiseChisel}).
+If the file has multiple extensions, a small window will also be opened along
with the main DS9 window.
+This small window allows you to slide through the image extensions of
@file{foo.fits}.
+If @file{foo.fits} only consists of one extension, then SAO DS9 will open as
usual.
+
+On the other hand, for visualizing the contents of tables (that are also
commonly stored in the FITS format), you need to call a different software
(most commonly, people use TOPCAT, see @ref{TOPCAT}).
+And to make things more inconvenient, by default both of these are only
installed as command-line software, so while you are navigating in your GUI,
you need to open a terminal there, and run these commands.
+All of the issues above are the founding purpose of the installed script that
is introduced in @ref{Invoking astscript-fits-view}.
+
+@menu
+* Invoking astscript-fits-view:: How to call this script
+@end menu
+
+@node Invoking astscript-fits-view, , Viewing FITS file contents with DS9 or
TOPCAT, Viewing FITS file contents with DS9 or TOPCAT
+@subsection Invoking astscript-fits-view
+
+Given any number of FITS files, this script will either open SAO DS9 (for
images or cubes) or TOPCAT (for tables) to visualize their contents in a
graphic user interface (GUI).
+For more on installed scripts, please see @ref{Installed scripts}.
+This script can be used with the following general template:
@example
-$ astscript-fits-view jplus-zeropoint.fits
+$ astscript-fits-view [OPTION] input.fits [input-b.fits ...]
@end example
@noindent
-After @code{TOPCAT} has opened, take the following steps:
-@enumerate
-@item
-From the ``Graphics'' menu, select ``Plain plot''.
-You will see the last HDU's scatter plot open in a new window (for
@code{APER-6}, with red points).
-The Bottom-left panel has the logo of a red-blue scatter plot that has written
@code{6:jplus-zeropoint.fits} in front of it (showing that this is the 6th HDU
of this file).
-In the bottom-right panel, you see the names of the columns that are being
displayed.
-@item
-In the ``Layers'' menu, Click on ``Add Position Control''.
-On the bottom-left panel, you will notice that a new blue-red scatter plot has
appeared but it just says @code{<no table>}.
-In the bottom-right panel, in front of ``Table:'', select any other extension.
-This will plot the same two columns of that extension as blue points.
-Zoom-in to the region of the horizontal line to see/compare the different
scatters.
-
-Change the HDU given to ``Table:'' and see the distribution of zero points for
the different apertures.
-@end enumerate
-
-The manual/visual operation above is critical if this is your first time with
a new dataset (it shows all kinds of systematic biases, like the Sky issue
above)!
-But once you know your data has no systematic biases, choosing between the
different apertures is not easy visually!
-Let's have a look at the table the @code{ZEROPOINTS} HDU (we don't need to
explicitly call this HDU since it is the first one):
+One line examples:
@example
-$ asttable jplus-zeropoint.fits -O -Y
-# Column 1: APERTURE [arcsec,f32,] Aperture used.
-# Column 2: ZEROPOINT [mag ,f32,] Zero point (sig-clip median).
-# Column 3: ZPSTD [mag ,f32,] Zero point Standard deviation.
-2.000 26.405 0.028
-3.000 26.436 0.030
-4.000 26.448 0.035
-5.000 26.458 0.042
-6.000 26.466 0.056
+## Call TOPCAT to load all the input FITS tables.
+$ astscript-fits-view table-*.fits
+
+## Call SAO DS9 to open all the input FITS images.
+$ astscript-fits-view image-*.fits
@end example
-The most accurate zero point is the one where @code{ZPSTD} is the smallest.
-In this case, the minimum of @code{ZPSTD} is with radii of 2 and 3 arcseconds.
-Run the @command{astscript-fits-view} command above again to open TOPCAT.
-Let's focus on the magnitude plots in these two apertures and determine a more
accurate range of magnitude.
-The more reliable option is the range between 16.4 (where we have no saturated
stars) and 18.5 mag (fainter than this, the scatter becomes too strong).
-Finally, let's set some more apertures between 2 and 3 arcseconds radius:
+This script will use Gnuastro's @ref{Fits} program to see if the file is a
table or image.
+If the first input file contains an image HDU, then the sequence of files will
be given to @ref{SAO DS9}.
+Otherwise, the input(s) will be given to @ref{TOPCAT} to visualize (plot) as
tables.
+When opening DS9 it will also inspect the dimensionality of the first image
HDU of the first input and open it slightly differently when the input is 2D or
3D:
-@example
-$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
- --refimgs=sdss1.fits,sdss2.fits \
- --output=jplus-zeropoint.fits \
- --magnituderange=16.4,18.5 \
- --refimgszp=22.5,22.5 \
- --aperarcsec=2,2.5,3,3.5,4 \
- --refimgshdu=0,0 \
- --keepzpap
+@table @asis
+@item 2D
+DS9's @option{-mecube} will be used to open all the 2D extensions of each
input file as a ``Multi-extension cube''.
+A ``Cube'' window will also be opened with DS9 that can be used to slide/flip
through the extensions.
+When multiple files are given, each file will be in one ``frame''.
-$ asttable jplus-zeropoint.fits -Y
-2.000 26.405 0.037
-2.500 26.425 0.033
-3.000 26.436 0.034
-3.500 26.442 0.039
-4.000 26.449 0.044
+@item 3D
+DS9's @option{-multiframe} option will be used to open all the extensions in a
separate ``frame'' (since each input is already a 3D cube, the @option{-mecube}
option can be confusing).
+To flip through the extensions (while keeping the slice fixed), click the
``frame'' button on the top row of buttons, then use the last four buttons of
the bottom row ("first", "previous", "next" and "last") to change between the
extensions.
+If multiple files are given, there will be a separate frame for each HDU of
each input (each HDU's name or number will be put in square brackets after its
name).
+@end table
+
+@cartouche
+@noindent
+@strong{Double-clicking on a FITS file to open DS9 or TOPCAT:} for graphic
user interfaces (GUIs) that follow the freedesktop.org standards (including
GNOME, KDE Plasma, or Xfce), Gnuastro installs a @file{fits-view.desktop} file
to instruct your GUI to call this script for opening FITS files when you click
on them.
+To activate this feature take the following steps:
+@enumerate
+@item
+Run the following command, while replacing @code{PREFIX}.
+If you do not know what to put in @code{PREFIX}, run @command{which astfits}
on the command-line, and extract @code{PREFIX} from the output (the string
before @file{/bin/astfits}).
+For more, see @ref{Installation directory}.
+@example
+ln -sf PREFIX/share/gnuastro/astscript-fits-view.desktop \
+ ~/.local/share/applications/
@end example
+@item
+Right-click on a FITS file, and choose these items in order (based on GNOME,
may be different in KDE or Xfce): @clicksequence{``Open with other
application''@click{}``View all applications''@click{}``astscript-fits-view''}.
+@end enumerate
+@end cartouche
-The aperture with the least scatter is therefore the 2.5 arcsec radius
aperture, giving a zero point of 26.425 magnitudes for this image.
-However, you can see that the scatter for the 3 arcsec aperture is also
acceptable.
-Actually, the @code{ZPSTD} of the 2.5 and 3 arcsec apertures only has a
difference of @mymath{3\%} (@mymath{= (0.034-0.033)/0.033\times100}).
-So simply choosing the minimum is just a first-order approximation (which is
accurate within @mymath{26.436-26.425=0.011} magnitudes).
-Note that in aperture photometry, the PSF plays an important role (because the
aperture is fixed but the two images can have very different PSFs).
-The aperture with the least scatter should also account for the differing PSFs.
-Overall, please always check the different intermediate steps to make
sure the parameters are good, so the estimation of the zero point is correct.
+@noindent
+This script takes the following options:
-If you are happy with the minimum, you don't have to search for the minimum
aperture or its corresponding zero point yourself.
-This script has written it in the @code{ZPVALUE} keyword of the table.
-With the first command below, we also see the name of the file (this is
useful when you run it on many files).
-With the second command, we are only printing the number by adding the
@option{-q} (or @option{--quiet}) option (this is useful in a script where you
want to write the value in a shell variable to use later).
+@table @option
-@example
-$ astfits jplus-zeropoint.fits --keyvalue=ZPVALUE
-jplus-zeropoint.fits 2.642512e+01
+@item -h STR
+@itemx --hdu=STR
+The HDU (extension) of the input dataset to display.
+The value can be the HDU name or number (the first HDU is counted from 0).
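+For example, to only view the @code{DETECTIONS} extension of NoiseChisel's
output (the file name here is hypothetical):
+@example
+$ astscript-fits-view nc.fits --hdu=DETECTIONS
+@end example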
-$ astfits jplus-zeropoint.fits --keyvalue=ZPVALUE -q
-2.642512e+01
-@end example
+@item -p STR
+@itemx --prefix=STR
+Directory to search for SAO DS9 or TOPCAT's executables (assumed to be
@command{ds9} and @command{topcat}).
+If this option is not called, they are assumed to be present in your
@file{PATH} (see @ref{Installation directory}).
+If you do not have them already installed, their installation directories are
available in @ref{SAO DS9} and @ref{TOPCAT} (they can be installed in
non-system-wide locations that do not require administrator/root permissions).
-Generally, this script will write the following FITS keywords (all starting
with @code{ZP}) for your future reference in its output:
+@item -s STR
+@itemx --ds9scale=STR
+The string to give to DS9's @option{-scale} option.
+You can use this option to use a different scaling.
+The Fits-view script will place @option{-scale} before your given string when
calling DS9.
+If you do not call this option, the default behavior is to call DS9 with
@option{-scale mode zscale} (equivalent to calling this script with
@option{--ds9scale="mode zscale"}).
-@example
-$ astfits jplus-zeropoint.fits -h1 | grep ^ZP
-ZPAPER = 2.5 / Best aperture.
-ZPVALUE = 26.42512 / Best zero point.
-ZPSTD = 0.03276644 / Best std. dev. of zeropoint.
-ZPMAGMIN= 16.4 / Min mag for obtaining zeropoint.
-ZPMAGMAX= 18.5 / Max mag for obtaining zeropoint.
-@end example
+The Fits-view script has the following aliases to simplify the calling of this
option (and avoid the double-quotations and @code{mode} in the example above):
-Using the @option{--keyvalue} option of the @ref{Fits} program, you can easily
get multiple of the values in one run (where necessary):
+@table @option
+@item zscale
+or @option{--ds9scale=zscale} equivalent to @option{--ds9scale="mode zscale"}.
+@item minmax
+or @option{--ds9scale=minmax} equivalent to @option{--ds9scale="mode minmax"}.
+@end table
-@example
-$ astfits jplus-zeropoint.fits --hdu=1 --quiet \
- --keyvalue=ZPAPER,ZPVALUE,ZPSTD
-2.500000e+00 2.642512e+01 3.276644e-02
-@end example
+@item -c FLT,FLT
+@itemx --ds9center=FLT,FLT
+The central coordinate for DS9's view of the FITS image after it opens.
+This is equivalent to the ``Pan'' button in DS9.
+The nature of the coordinates will be determined by the @option{--ds9mode}
option that is described below.
-@node Zero point tutorial with reference catalog, Invoking
astscript-zeropoint, Zero point tutorial with reference image, Zero point
estimation
-@subsection Zero point tutorial with reference catalog
+@item -O img/wcs
+@itemx --ds9mode=img/wcs
+The coordinate system (or mode) to interpret the values given to
@option{--ds9center}.
+This can either be @option{img} (or DS9's ``Image'' coordinates) or
@option{wcs} (or DS9's ``wcs fk5'' coordinates).
-In @ref{Zero point tutorial with reference image}, we explained how to use the
@command{astscript-zeropoint} for estimating the zero point of one image based
on a reference image.
-Sometimes there is not a reference image and we need to use a reference
catalog.
-Fortunately, @command{astscript-zeropoint} can also use the catalog instead of
the image to find the zero point.
+@item -g INTxINT
+@itemx --ds9geometry=INTxINT
+The initial DS9 window geometry (value to DS9's @option{-geometry} option).
-To show this, let's download a catalog of SDSS in the area that overlaps with
the cropped J-PLUS image (used in the previous section).
-For more on Gnuastro's Query program, please see @ref{Query}.
-The columns of ID, RA, Dec and magnitude in the SDSS @emph{r} filter are
called by their name in the SDSS catalog.
+@item -m
+@itemx --ds9colorbarmulti
+Do not show a single color bar for all the loaded images.
+By default this script will call DS9 in a way that a single color bar is shown
for any number of images.
+A single color bar is preferred for two reasons: 1) when there are a lot of
images, multiple color bars consume a large fraction of the display area; 2)
the color bars are locked by this script, so there is no difference between
them!
+With this option, you can have separate color bars under each image.
+@end table
-@example
-$ astquery vizier \
- --dataset=sdss12 \
- --overlapwith=jplus-crop.fits \
- --column=objID,RA_ICRS,DE_ICRS,rmag \
- --output=sdss-catalog.fits
-@end example
-To visualize the position of the SDSS objects over the J-PLUS image, let's use
@command{astscript-ds9-region} (for more details please see @ref{SAO DS9 region
files from table}) with the command below (it will automatically open DS9 and
load the regions it created):
-@example
-$ astscript-ds9-region sdss-catalog.fits \
- --column=RA_ICRS,DE_ICRS \
- --color=red --width=3 --output=sdss.reg \
- --command="ds9 jplus-nc.fits[INPUT-NO-SKY] \
- -scale zscale"
-@end example
-Now, we are ready to estimate the zero point of the J-PLUS image based on the
SDSS catalog.
-To download the input image and understand how to use the
@command{astscript-zeropoint}, please see @ref{Zero point tutorial with
reference image}.
-Many of the options (like the aperture size) and magnitude range are the same
so we will not discuss them further.
-You will notice that the only substantive difference of the command below with
the last command in the previous section is that we are using @option{--refcat}
instead of @option{--refimgs}.
-There are also some cosmetic differences (for example, a new output name, not
using @option{--refimgszp} since it is only necessary for images) and the
@option{--*column} options which are used to identify the names of the
necessary columns of the input catalog:
-@example
-$ astscript-zeropoint jplus-nc.fits --hdu=INPUT-NO-SKY \
- --refcat=sdss-catalog.fits \
- --refcatmag=rmag \
- --refcatra=RA_ICRS \
- --refcatdec=DE_ICRS \
- --output=jplus-zeropoint-cat.fits \
- --magnituderange=16.4,18.5 \
- --aperarcsec=2,2.5,3,3.5,4 \
- --keepzpap
-@end example
-@noindent
-Let's inspect the output with the command below.
-@example
-$ asttable jplus-zeropoint-cat.fits -Y
-2.000 26.337 0.034
-2.500 26.386 0.036
-3.000 26.417 0.041
-3.500 26.439 0.043
-4.000 26.455 0.050
-@end example
-As you see, the values and standard deviations are very similar to the results
we got previously in @ref{Zero point tutorial with reference image}.
-The standard deviations are generally a little higher here because we didn't
do the photometry ourselves, but they are statistically similar.
-Before we finish, let's open the two outputs (from a reference image and
reference catalog) with the command below.
-To confirm how they compare, we are showing the result for @code{APER-3}
extension in both (following the TOPCAT plotting recipe in @ref{Zero point
tutorial with reference image}).
+@node Zero point estimation, Dithering pattern simulation, Viewing FITS file
contents with DS9 or TOPCAT, Installed scripts
+@section Zero point estimation
-@example
-$ astscript-fits-view jplus-zeropoint.fits jplus-zeropoint-cat.fits \
- -hAPER-3
-@end example
+@cindex Zero point
+@cindex Calibration
+@cindex Astrometry
+Through the ``zero point'', we are able to give physical units to the pixel
values of an image (often in units of ``counts'' or ADUs) and thus compare them
with other images (as well as measurements that are done on them).
+The zero point is therefore an important calibration of pixel values (as
astrometry is a calibration of the pixel positions).
+The fundamental concepts behind the zero point are described in
@ref{Brightness flux magnitude}.
+We will therefore not go deeper into the basics here and stick to the
practical aspects of it.
+
+The purpose of Gnuastro’s @command{astscript-zeropoint} script is to obtain
the zero point of an image by considering another image (where the zero point
is already known), or a catalog.
+The operation involves multiple lower-level programs in a standard series of
steps.
+For example, when using another image, the script will take the following
steps:
+
+@enumerate
+@item
+Download the Gaia catalog that overlaps with the input image using Gnuastro’s
Query program (see @ref{Query}).
+This is done to determine the stars within the image@footnote{Stars have an
almost identical shape in the image (as opposed to galaxies for example), using
confirmed stars will produce a more reliable result.}.
+@item
+Perform aperture photometry@footnote{For a complete tutorial on aperture
photometry, see @ref{Aperture photometry}.} with @ref{MakeProfiles} and
@ref{MakeCatalog}.
+We will assume a zero point of 0 for the input image.
+If the reference is an image, then we should perform aperture photometry also
in that image.
+@item
+Match the two catalogs@footnote{For a tutorial on matching catalogs, see
@ref{Matching catalogs}.} with @ref{Match}.
+@item
+The difference between the input and reference magnitudes should be
independent of the magnitude of the stars.
+This does not hold when the stars are saturated in one/both the images (giving
us a bright-limit for the magnitude range to use) or for stars fainter than a
certain magnitude, where the signal-to-noise ratio drops significantly in
one/both images (giving us a faint limit for the magnitude range to use).
+@item
+Since a zero point of 0 was used for the input image, the magnitude difference
above (in the reliable magnitude range) is the zero point of the input image.
+@end enumerate
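+As a minimal sketch, a call using a single reference image may look like the
following (all the file names and values here are hypothetical; see the
tutorials and @ref{Invoking astscript-zeropoint} for real examples):
+@example
+## Estimate the zero point from a calibrated reference image.
+$ astscript-zeropoint input.fits --hdu=1 \
+                 --refimgs=ref.fits --refimgshdu=0 \
+                 --refimgszp=22.5 --aperarcsec=3 \
+                 --output=zeropoint.fits
+@end example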
+In the ``Tutorials'' chapter of this Gnuastro book, there are two tutorials
dedicated to the usage of this script.
+The first uses an image as a reference (@ref{Zero point tutorial with
reference image}) and the second uses a catalog (@ref{Zero point tutorial with
reference catalog}).
+For the full set of options and a detailed description of each, see
@ref{Invoking astscript-zeropoint}.
+@menu
+* Invoking astscript-zeropoint:: How to call the script
+@end menu
-@node Invoking astscript-zeropoint, , Zero point tutorial with reference
catalog, Zero point estimation
+@node Invoking astscript-zeropoint, , Zero point estimation, Zero point
estimation
@subsection Invoking astscript-zeropoint
This installed script will calculate the zero point of an input image to
calibrate it.
The reference can be an image or catalog (which has been previously
calibrated).