How to HITRAN line-by-line

This is a basic introductory guide on how to write a line-by-line calculation routine using the HITRAN database which is widely used for the calculation of atmospheric and molecular absorption spectra of the most abundant atmospheric gases (see list here) for given input parameters of the desired temperature, total pressure and molecular concentrations.   This guide is a practical supplement to the information provided by the HITRAN group at this link which contains a list of the HITRAN database parameters, their detailed description and the most relevant calculation formulas.

The line-by-line calculation procedure involves computation of absorption for individual spectral line data records provided in the HITRAN database over a desired spectral range and summing these calculated single spectral line absorption profiles together to form the total absorption spectrum as a function of wavenumber (or wavelength).

What is supplied by HITRAN?

The HITRAN database supplies all data needed to perform line-by-line calculations except for the atmospheric model profiles which are described in more detail further.   The supplied data include the
(1)   HITRAN database data file (traditionally an ASCII file with the PAR extension) which may now be obtained from HITRANonline,
(2)   the molparam.txt file which contains reference data such as the molecular mass etc. for each isotope in the HITRAN compilation, and
(3)   the files with the Total Internal Partition Sums (TIPS) for different temperatures in the "q_Isotope_Global_Id.txt" format available here.

HITRAN database records

The 2004 - 2012 year ASCII edition of the HITRAN database consists of 160 character records for each single spectral line.   Below is the listing of the fields for each HITRAN database spectral line record with their corresponding ASCII character field lengths.   This table is based on that provided at as well as Table 2 in the HITRAN 2004 edition paper.   Not all data supplied in each HITRAN database record is required to run line-by-line calculations.   The parameters required to run the line-by-line calculation using the Voigt profile are indicated with a tick mark.

- single record parameters used in calculations, - parameters not used in calculations.

      Symbol Length Units Description
  $M$ 2 - Molecule number
  $I$ 1 - Isotope number
  $\nu_{ij}$ 12 $cm^{-1}$ Vacuum wavenumber
  $S_{ij}$ 10 $cm^{-1} / (molecule \cdot cm^{-2})$ Line intensity
  $A$ 10 $s^{-1}$ Einstein A-coefficient
  $\gamma_{air}$ 5 $cm^{-1} / atm$ Air-broadened half width at half maximum (HWHM)
  $\gamma_{self}$ 5 $cm^{-1} / atm$ Self-broadened half width at half maximum (HWHM)
  $E^{ \prime\prime}$ 10 $cm^{-1}$ Lower-state energy
  $n_{air}$ 4 $cm^{-1}$ Temperature dependence exponent for $\gamma_{air}$
  $\delta_{air}$ 8 $cm^{-1} / atm$ Air pressure-induced line shift
  $V^{ \prime}$ 15 - Upper-state "global" quanta
  $V^{ \prime\prime}$ 15 - Lower-state "global" quanta
  $Q^{ \prime}$ 15 - Upper-state "local" quanta
  $Q^{ \prime\prime}$ 15 - Lower-state "local" quanta
  $I_{err}$ 6 - Uncertainty indices
  $I_{ref}$ 12 - Reference indices
  $*$ 1 - Line mixing flag
  $g^{\prime}$ 7 - Statistical weight of the upper state
  $g^{\prime\prime}$ 7 - Statistical weight of the lower state
The molparam.txt file

The molparam.txt file contains basic data for each molecule and isotope needed to perform the line-by-line calculations.   These include the molecule number $M$, the Isotope code, terrestrial abundance $B_{T}$, total internal partition sum value at 296K denoted as $Q(296K)$, ground state degeneracy factor $gj$, and the molar mass $W_{g}$ in grams per mole.   The molecule numbers $M$ are indicated in brackets in the molparam.txt after each molecule name followed by the isotopes arranged in ascending order by their isotope number $I$.   Please note that the Isotope code has nothing to do with the isotope number $I$ field in the HITRAN database records.   The second field called $I$ used for isotope numbering in the HITRAN database is simply an index of each isotope starting from 1 in the increasing order of their relative abundance in the Earth atmosphere.   The only exceptions are isotopes 10, 11 and 12 of CO2 which have indexes of 0 (number zero), A, and B respectively.

The TIPS files

The Total Internal Partition Sums (TIPS) files are available here.   Each TIPS file contains 2 columns.   The first column contains the temperature in Kelvin, and the second one the corresponding Total Internal Partition Sum (TIPS) for that temperature $Q_{T}$.   The value of the total internal partition sum at $296K$ denoted as $Q_{296K}$ may be taken from the TIPS files or the molparam.txt file.   The TIPS values are used to recalculate the spectral line intensity from the reference temperature of 296K into the user-specified calculation temperature.

The pre line-by-line calculation sequence

Several parameters need to be calculated prior to running the line-by-line routine.   First of all, the center of the spectral line as indicated in the HITRAN database will change due to pressure.   As such the pressure induced spectral line shift needs to be taken into account.   The shifted spectral line center $ \nu_{ij}^{*}$ due to pressure compared to the unperturbed value specified in the HITRAN database is

$$ \nu_{ij}^{*} = \nu_{ij} + \delta_{air} \cdot P_{total} $$

where $P_{total}$ is the total atmospheric pressure in atmospheres (atm).   The intensity of the spectral line also needs to be recalculated for the user-specified temperature

$$ S_{ij}^{*} = S_{ij} \cdot \left( \frac{Q_{296K}}{Q_{T}} \right) \cdot \frac{exp\left(-c_{2} \cdot E^{\prime\prime} / \, T\right)}{exp\left(-c_{2} \cdot E^{\prime\prime} / \, 296\right)} \cdot \frac{1 - exp\left(-c_{2} \cdot \nu_{ij} \, / \, T\right)}{1 - exp\left(-c_{2} \cdot \nu_{ij} \, / \, 296\right)} $$

where $T$ is the calculation temperature in Kelvin, $Q_{T}$ is the total internal partition sum value for the calculation temperature, $Q_{296K}$ is the total internal partition sum for the temperature of 296K, and $c_{2} = c \cdot h / k $ = 1.43880285 (cm · K) is the second black body radiation constant.   After the above calculation, we need to factor in the pathlength, molecular concentration and the molecule isotopic abundance into the calculated line intensity.

$$ S_{L} = S_{ij}^{*} \cdot \frac{B}{B_{T}} \cdot \frac{296}{T} \cdot N_{L} \cdot L \cdot P_{mol}$$

where $B$ is the isotope abundance in the atmosphere being modelled, $B_{T}$ is the isotope abundance in the terrestrial atmosphere, $T$ is the calculation temperature in degree Kelvin, $N_{L}$= 2.686780524e+19 is the Loschmid's number (in $cm^{-3}$) at a temperature of 296K and 1 atm pressure, $L$ is the pathlength in centimeters, and $P_{mol}$ is the partial pressure in atmospheres (atm) for the molecule with index $M$ being calculated.   For a more detailed description of this and several other derivation steps, please refer to Appendix 1 in this manual.

At the next step the Doppler and Lorentzian lineshape half widths at half maximum (HWHM) in 1/cm need to be found.   The Doppler HWHM is calculated as follows:

$$ \alpha_{doppler} = \frac{\nu_{ij}}{c} \cdot \sqrt{ \frac{2 \cdot N_{A} \cdot k \cdot T \cdot ln(2)}{10^{-3} \cdot W_{g}} } $$

where $T$ is the calculation temperature, $N_{A}$ = 6.02214086e+23 is the Avogadro number (in 1/mol) , $k$ = 1.3806503e-23 is the Boltzman's constant (in $m^{2} \cdot kg \cdot s^{-2} \cdot K^{-1}$),   $c$ is the speed of light (in m/s), $W_{g}$ is the molar mass of the molecule (in g/mol, located in the molparam.txt) for a given molecule $M$.

The Lorentzian HWHM (in 1/cm) is given by the following formula:

$$ \gamma = \left(\dfrac{296}{T}\right)^{n_{air}} \cdot \Big( \gamma_{air} \cdot \left( P_{total} - P_{mol} \right) + \gamma_{self} \cdot P_{mol} \Big) $$

where $\gamma$ is the half width at half maximum for a given line, $P_{total}$ is the total atmospheric pressure in atmospheres (atm), $P_{mol}$ is the partial pressure in atmospheres (atm) corresponding to the molecule being calculated.

Lineshape calculations and the line-by-line

The data in the HITRAN database is supplied in wavenumbers, as such the calculation is typically performed in wavenumbers (cm-1).   Usually one of the first steps in the line-by-line calculation would involve forming a wavenumber grid over the entire calculation range to run the calculation.   The easiest solution is to use an equidistant grid with a fixed wavenumber step.

Some spectral lines with weak line intensities (or the pre-calculated peak absorption) may be left out in the line-by-line calculation to speed up the computation.   The calculation of individual spectral line absorption components is then carried out.   The wavelength calculation span for the individual spectral lines is specified either as an absolute value (in the units of 1/centimeter, nanometers, micron etc.) or in terms of the number of saturated halfwidths (multiples of values of $\alpha_{doppler}$ or $\gamma$ described above).   The section below gives examples of calculating the Dopper, Lorentzian or Voigt profile spectral lines.   These are the most widely used general purpose profiles.

(1)   If the Doppler lineshape profile is desired, it is calculated by varying $\nu$ over the wavenumber span of a single spectral line as follows

$$ f_{doppler}(\nu) = \sqrt{ \frac{ln(2)}{\pi \cdot \alpha_{doppler}^{2}}} \cdot exp \left( - \frac{ \left( \nu - \nu_{ij}^{*} \right)^{2} \cdot ln(2) }{ \alpha_{doppler}^{2} } \right) $$

(2)   When Lorentz lineshape profile is used, it is calculated over the wavenumber range of a single spectral line $ \nu $ using the following equation

$$ f_{lorentz}(\nu) = \frac{1}{\pi} \cdot \frac{\gamma}{\gamma^2 + \left( \nu - \nu_{ij}^{*} \right)^{2} } $$

(3)   For the calculation of the Voigt profile (see this presentation and this thesis) approximations are typically used.   The general form of the Voigt function is

$$ Voigt(x,y) = K(x,y) + iL(x,y) $$ $$ K(x,y) = \frac{y}{\pi} \cdot \int_{-\infty}^{\infty} \frac{exp(-\nu^2)}{y^2 + (x - \nu)^2} d\nu $$ $$ L(x,y) = \frac{1}{\pi} \cdot \int_{-\infty}^{\infty} \frac{(x - \nu) \cdot exp(-\nu^2)}{y^2 + (x - \nu)^2} d\nu $$


$$ x = \frac { \sqrt{ln(2)} \cdot (\nu - \nu_{ij}^{*}) }{ \alpha_{doppler} } ; \qquad y = \frac { \sqrt{ln(2)} \cdot \gamma }{ \alpha_{doppler} } $$

and the Voigt spectral line shape $f_{voigt}(\nu)$ is computed as follows (see this presentation, slides 17-22, please note that our $\alpha_{doppler}$ and $\gamma$ values are HWHM and those in the link are given in full width at half maximum (FWHM)):

$$ f_{voigt}(\nu) = \frac { \sqrt{ln(2)} } { \sqrt{\pi} \cdot \alpha_{doppler} } \cdot Voigt(x,y) $$

The spectral lineshape is now combined together with the calculated line intensity as follows to get the final attenuation value due to a single line in the units of optical depth $O_{d}(\nu)$ as a function of wavenumber which may be further converted into other units: $$ O_{d}(\nu) = S_{L} \cdot f_{voigt}(\nu) $$

Please note that parameter $S_{L}$ in this formula has the units of 1/cm because it already includes the optical pathlength $L$ as shown earlier.   The Voigt function $Voigt(x,y)$ may be calculated using a number of algorithms.   The current version of the bytran software uses the rational approximation algorithm described in Abrarov, S., and Quine, B. "A rational approximation for efficient computation of the Voigt function in quantitative spectroscopy". Journal of Mathematics Research, 7(2), p 163, (2015) available here which also contains the algorithm source code in the Matlab language.   This algorithm is very easy to implement by substituting values of $x$ and $y$ as described above.   The usage of the Voigt profile is sufficient for many applications, however more advanced lineshape models may be needed where higher accuracy is required.

After calculating the spectral absorption profiles due to individual lines as described above, they are added together to obtain the total spectrum by taking into account individual spectral line spans along the wavenumber grid.

Radiance and Instrument functions

The Radiance and the Instrument function calculations in bytran are a direct conversion of the HAPI Python code into the Qt/C++ with some modifications to take into account the specifics of the Qt/C++ environment and program structure.

The radiance calculation involves the application of a simple formula after the high resolution spectrum has been calculated which can be found in a procedure called "radianceSpectrum" of the HAPI source code or else in Appendix 4, page 141 of this manual which presents a better explanation of the physics involved.

The instrument function routines used in bytran are those introduced in the HAPI source code under the "SPECTRAL CONVOLUTION" section.   The explanations below outline the steps and formulas used in the algorithm.   Before the spectral convolution can be performed, a wavenumber array corresponding to the instrument function to be used needs to be generated over the wavenumber range $x$ spanning from $-x_{0}$ to $x_{0}$ with the spectral resolution equivalent to that of the calculated transmission spectrum to be smoothed with the instrument function.   Here $n$ is the number of points in the instrument function array, and $x_{0}$ is equal to the instrumental function wing parameter which has to be larger than the instrument resolution parameter $g$.   The instrument function shape ("y-axis") values $B(x)$ are then generated over the array grid of $x$ values (from $-x_{0}$ to $x_{0}$) using one of the formulas presented below depending on the desired instrument function.

Rectangular   (where $g$ is a slit width or the instrumental resolution):

$$ B(x) = \newcommand\ddfrac[2]{\frac{\displaystyle #1}{\displaystyle #2}} \begin{equation} \begin{cases} \ddfrac {1}{g}, \enspace when \enspace |x| \leqslant \ddfrac {g}{2} \\ \, 0, \enspace when \enspace |x| > \ddfrac {g}{2} \\ \end{cases} \end{equation} $$

Triangular   (where $g$ is the line width equal to the half base of the triangle):

$$ B(x) = \begin{equation} \begin{cases} \ddfrac {1}{g} \cdot \left( 1 - \ddfrac {|x|}{g} \right) , \enspace when \enspace |x| \leqslant g \\ \, 0, \enspace when \enspace |x| > g \\ \end{cases} \end{equation} $$

Gaussian   (where $g/2$ is a Gaussian half-width at half-maximum):

$$ B(x) = \frac {\sqrt{ \ddfrac {ln(2)}{\pi} }} { \ddfrac{g}{2} } \cdot exp \left(-ln(2) \cdot \left( \frac {x}{(g/2)} \right)^2 \right) $$

Dispersion (Lorentz)   (where $g/2$ is a Lorentzian half-width at half-maximum):

$$ B(x) = \frac{g}{2 \pi} \cdot \frac{1} {x^{2} + \left( \ddfrac {g}{2} \right) ^{2}} $$


$$ B(x) = \begin{equation} \begin{cases} \ddfrac{1}{g} \cdot \ddfrac { \left( sin \left( \ddfrac {\pi \cdot x}{g} \right) \right) ^2} { \left( \ddfrac {\pi \cdot x}{g} \right) ^2 }, \enspace when \enspace x \neq 0 \\ 1, \enspace when \enspace x = 0\\ \end{cases} \end{equation} $$

Michelson   (where $1/g$ is the maximum optical path difference):

$$ B(x) = \begin{equation} \begin{cases} \ddfrac{2}{g} \cdot \ddfrac { sin \left( \ddfrac {2\pi \cdot x}{g} \right)} { \left( \ddfrac{2\pi \cdot x}{g} \right) }, \enspace when \enspace x \neq 0 \\ 1, \enspace when \enspace x = 0\\ \end{cases} \end{equation} $$

The instrument function $B(x)$ is then normalized before the convolution procedure

$$ B_{n}(x) = \frac{B(x_{i})}{\sum\limits_{i=1}^n B(x_{i})} $$

The application of the instrument function smoothing to the generated high resolution spectrum is carried out by convolving the generated instrument function array $B_{n}(x)$ with the high resolution spectrum in a "scanning" manner along the wavenumber axis.   For an easy to understand introductory explanation of the convolution procedure these lecture course notes at Cornell University and University of Cape Town may be consulted.

Molecular concentrations and the Atmospheric models

The concentrations of molecules may be specified manually for line-by-line calculations.   However, the standard way to model the atmospheric transmission spectra is by using the molecular concentration profiles, the total pressures and temperatures from atmospheric models.   It is very common to use the US Standard 1986 Atmospheric model described in this document even though other model profiles are widely available.   There are 6 models available in the 1986 US Standard: Tropical, Midlatitude summer, Midlatitude winter, Subarctic summer, Subarctic winter, and the US Standard.   Even though these standard profiles do not include data for all molecules currently available in the HITRAN database, they provide values for the most abundant molecules which is sufficient for many applications. For more exotic calculation scenarios, the unavailable values need to be supplemented from other sources. -|- 2016