Thursday, October 7, 2010

GAMS/GDX

GAMS has a binary data file format for which an API is available. In the last few weeks I encountered a number of similar issues with this in different projects.
The interface does require more knowledge of the internals of GAMS than most people have. An example is here. When we want to read a GAMS parameter A(I,J) into a C/Fortran/Delphi matrix A[1..m,1..n] we see this is not completely trivial. The basic obstacle is that GAMS has strings as indices, while most programming languages have simple integers as indices. I.e. we need to map J = /new-york, chicago, topeka/ into j=1..3. For large problems that means some kind of hashing or other way to quickly look up strings. There is also a “raw” mode interface that does not give strings but rather internal GAMS numbers. So we need to map J = /1001,1002,1005/ to j=1..3. This again is not a completely trivial task. If you are familiar with the internals of GAMS one can know that GAMS internal numbers (so-called UEL numbers) are by definition increasing. So one could use a binary search to quickly find numbers. A fortran routine for this can look like:
image
Another possibility would by to have an array of the size of MAXUEL that has mapping information from UEL-> j=1..3. That requires more space but has quick look up. If the GDX file contains only one set, one could use socalled “mapped” mode, but that cannot be used in general.

It is also worth noting that GAMS only stores and exports nonzero elements. So we cannot assume all A(i,j)’s are in the GDX file.

So to recap, my suggested approach to read a matrix A(i,j) from a GDX file is:

  1. Read 1D set I (in raw mode) and store UEL’s in an integer array IA
  2. Read 1D set J (in raw mode) and store UEL’s in an integer array JA
  3. Allocate array A(i,j) and set all elements to zero
  4. Read 2D parameter A (in raw mode). For each record do:
    1. Retrieve (ueli, uelj, aij)
    2. i := bsearch(IA, length(IA), ueli)
    3. j := bsearch(JA, length(JA), uelj)
    4. A[i,j] := aij

If you don’t want to use binary search an alternative algorithm would be:

  1. Retrieve MAXUEL (gdxSystemInfo)
  2. Allocate integer arrays IA and JA of length MAXUEL
  3. Read 1D set I (in raw mode). For each record do:
    1. Retrieve uel
    2. IA[uel] := recordnumber
  4. Read 1D set J into array JA as described in step 3
  5. Read 2D parameter A (in raw mode). For each record do:
    1. Retrieve (ueli, uelj, aij)
    2. i := IA[ueli]
    3. j := JA[uelj]
    4. A[i,j] := aij


Notes:

  • The generated interface files are quite ugly, have strange design and implementation and have no comments on usage. The code in the provided files just looks very awkward. Below are a few more specific issues illustrating this.
  • The filenames are incomprehensible: gdxdocpdef.pas, gdxdcpdef.pas ??? 
  • The functions should return boolean/logical values instead of integers.
  • The previous point is the more crucial as some routines have 0=success and others have a return code <>0 meaning success. That throws me off balance quite often.
  • Fortran support for many compilers is missing.
  • The examples are too simplistic. The above is not explained at all.
  • The generated interface contains some code that I would refuse from my students when I was teaching programming at undergraduate level. An example is:
       n: ShortString
    A symbol n should always be an integer, never a string (or a floating point number).
  • The Delphi interface does not compile under the latest compiler.
Delphi 2010

GAMS users routinely generate very large GDX files with millions of records. To get data in and out of these files efficiently still requires substantial skills and effort.

No comments:

Post a Comment